SOME  CONTRIBUTIONS  TO 
SMALL  AREA  ESTIMATION 


By 

KARABI  SINHA 


A DISSERTATION  PRESENTED  TO  THE  GRADUATE  SCHOOL 
OF  THE  UNIVERSITY  OF  FLORIDA  IN  PARTIAL  FULFILLMENT 
OF  THE  REQUIREMENTS  FOR  THE  DEGREE  OF 
DOCTOR  OF  PHILOSOPHY 

UNIVERSITY  OF  FLORIDA 


2004 


Copyright  2004 
by 

Karabi  Sinha 


To  Baba  and  Maa. 


ACKNOWLEDGMENTS 


My  sincerest  gratitude  is  to  my  advisor,  Professor  Malay  Ghosh,  for  his 
unyielding  support  and  invaluable  guidance  throughout  my  graduate  work.  None  of 
this  would  have  been  possible  without  him. 

My  sincere  thanks  are  due  to  Professor  Dalho  Kim  for  the  immense  help  I 
received from him in computation. I would also like to thank all my committee
members, Professors Ronald Randles, George Casella, Cyndi Garvan and Bruce
Vogel, for their constructive comments regarding my dissertation.

There  are  some  without  whom  this  journey  would  not  have  been  possible. 

From my undergraduate days, I owe everything to my teacher, Professor Kashinath
Chatterjee. His love for Statistics and for his students, his art of teaching (yes, he
makes it an art) and his devotion inspired me (and still continue to do so).

At the University of Florida, I have been very fortunate to have had Cyndi
Garvan as my mentor in Biostatistics. She has been extremely kind, encouraging
and  inspiring.  My  experiences  as  a statistician  in  different  real-life  applications  and 
projects  have  proved  to  be  invaluable  and  I owe  it  all  to  her. 

Finally,  there  are  some  without  whom  my  life  would  not  have  been  complete. 
Dola-mashi  not  only  made  this  a “home  away  from  home”  but  she  became  my  best 
friend.  She  is  a pillar  of  strength  and  touches  the  lives  and  hearts  of  all  that  come 
in  contact  with  her.  She  is  a true  friend  and  I feel  honored  and  grateful  to  have  her 
in  my  life. 

My dear friends, Carmen, Toni and Belkys, my fellow partners in the toils and
joys  of  graduate  school  life,  are  like  my  sisters  and  I consider  myself  fortunate  to 
have  them  in  my  life. 


My little brother, one-and-a-half years younger than me, is my buddy, my
partner,  my  joy  and  my  constant  source  of  inspiration. 

Without  my  mother,  I don’t  think  I would  have  made  it.  If  ever  caught  in 
a gloomy  moment  or  a moment  of  doubt,  she  would  be  the  person  to  talk  to. 

Maa  urges,  encourages,  inspires  and  will  not  hear,  for  a moment,  anything  in  the 
negative.  She  is  the  most  positive  person  I know  and  sees  me  through  my  gloomy 
days  and  takes  pride  in  my  best. 

Finally,  to  the  one  who  champions  me  through  everything,  to  the  one  I look  up 
to  for  everything,  to  my  father,  to  Baba,  I owe  everything. 



TABLE  OF  CONTENTS 

page

ACKNOWLEDGMENTS iv 

LIST  OF  TABLES viii 

LIST  OF  FIGURES x 

ABSTRACT  xi 

CHAPTER 

1 SMALL  AREA  ESTIMATION  : AN  OVERVIEW  AND  A SELECTIVE 

REVIEW 1 

1.1  Introduction 1 

1.2  SAE  Techniques  : A Brief  Description 3 

1.3  Some  Illustrative  Examples 5 

1.4  Some  Major  References 5 

1.5  Use  of  EB  and  HB  Estimators  in  SAE 6 

1.6  Discussion  of  the  Work  of  Fay  and  Herriot 14 

1.7  Considerations  for  the  Calculation  of  Bayes  Risk  of  the  EB  estimator  17

1.8  HB  Estimators  and  Corresponding  Bayes  Risks 18 

1.9  Normality-based  Prediction  20 

1.9.1  Prediction  in  Multivariate  Normal  Population 20 

1.9.2  Bayes  Prediction  of  Domain  Means  in  Finite  Population 

Sampling 21 

1.9.3  EB  Prediction  of  Domain  Means  in  Finite  Population  Sampling  22

1.10  Use  of  Variance  Components  Models  in  SAE 24 

1.10.1  Prediction  in  Variance  Components  Model 25 

1.10.2  Estimation  of  cr^ 26 

1.10.3  Estimation  of  cr^ 26 

1.10.4  Prediction  on  Uj 27 

1.10.5  Prediction  on  7j 27 

1.11  Measurement  Error  Models 29 

1.12  Discussion  of  the  Work  of  Ghosh  et  al  (1998) 30 

1.13  Salient  Features  and  Layout  of  this  Dissertation 31 


2 EMPIRICAL  BAYES  ESTIMATION  IN  FINITE  POPULATION  SAMPLING
UNDER  FUNCTIONAL  MEASUREMENT  ERROR  MODELS 33

2.1  Introduction 33 

2.2  EB  Estimators 35 

2.3  Bayes  Risks 38 

2.4  Simulation  Study 41 

3 EMPIRICAL  AND  HIERARCHICAL  BAYES  ESTIMATION 
IN  FINITE  POPULATION  SAMPLING 

UNDER  STRUCTURAL  MEASUREMENT  ERROR  MODELS 49 

3.1  Introduction 49 

3.2  EB  Estimators 50 

3.3  Asymptotic  Optimality  of  the  EB  Predictor  55 

3.4  HB  Predictors 56 

3.5  Simulation  Study 60 

3.6  Data  Analysis 61 

4 EMPIRICAL  AND  HIERARCHICAL  BAYES  ESTIMATION  FOR  BINARY  RESPONSE 69

4.1  Introduction 69 

4.2  HB  Model 70 

4.3  EB  estimation 72 

4.4  Data  analysis 83 

4.4.1  Selection  of  covariates 84 

4.4.2  Small  Domain  Estimates  for  Asians 85 

5 CONCLUDING  REMARKS  AND  SCOPE  FOR  FUTURE  WORK  ...  104 

APPENDIX  105 

REFERENCES 126 

BIOGRAPHICAL  SKETCH 129 



LIST  OF  TABLES 

Table  page 

2-1  The  sample  sizes  (nj),  the  population  (“true”)  means  (TM),  the  sam- 
ple means  (SM),  the  regression  estimates  (R),  the  empirical  Bayes  es- 
timates (EB)  and  the  corresponding  RMSE’s  for  the  12  strata  when 
0-^  = 0 43 

2-2  The  population  means  (TM),  the  sample  means  (SM),  the  regression 
estimates  (R),  the  empirical  Bayes  estimates  (EB),  and  the  correspond- 
ing RMSE’s  for  the  12  strata  when  cr^  = 2.5 44 

2-3  The  population  means  (TM),  the  sample  means  (SM),  the  regression 
estimates  (R),  the  empirical  Bayes  estimates  (EB),  and  the  correspond- 
ing RMSE’s  for  the  12  strata  when  cr^  = 4 45 

2-4  The  population  mean,  the  3 estimators  and  their  RMSE’s  for  the  12 

counties  when  bi  = 5 and  = 4.5 46 

3-1  The  sample  sizes,  means  and  RMSE’s  for  the  12  counties  61 

3-2  Survey  and  Satellite  Data  for  Soybeans  in  12  Iowa  counties 62 

3-3  Predicted  Hectares  of  Soybean  With  Standard  Errors  of  Alternative 

Predictors  63 

3-4  Predicted  Hectares  of  Soybean  With  Standard  Errors  of  Alternative 

Predictors,  using  soybean  pixels  as  the  only  covariate 65 

3-5  Predicted  Hectares  of  Soybean  (HB  and  BHF)  With  Corresponding 

Standard  Errors,  using  soybean  pixels  as  the  only  covariate 66 

4-1  Definition  of  Domains  for  Asians 89 

4-2  Definition  of  Domains  for  Asians 90 

4-3  Definition  of  Domains  for  Asians 91 

4-4  Small  Area  Estimates  of  the  Proportions  of  Uninsured  Asian:  year  1997  92 

4-5  Small  Area  Estimates  of  the  Proportions  of  Uninsured  Asian:  year 

1997  (continued)  93 

4-6  Small  Area  Estimates  of  the  Proportions  of  Uninsured  Asian:  year 

1997  (continued) 94 

4-7  Small  Area  Estimates  of  the  Proportions  of  Uninsured  Asian:  year  1998  95 


4-8  Small  Area  Estimates  of  the  Proportions  of  Uninsured  Asian:  year 

1998  (continued) 96 

4-9  Small  Area  Estimates  of  the  Proportions  of  Uninsured  Asian:  year 

1998  (continued) 97 

4-10  Small  Area  Estimates  of  the  Proportions  of  Uninsured  Asian:  year  1999  98 

4-11  Small  Area  Estimates  of  the  Proportions  of  Uninsured  Asian:  year 

1999  (continued) 99 

4-12  Small  Area  Estimates  of  the  Proportions  of  Uninsured  Asian:  year 

1999  (continued) 100 

4-13  Small  Area  Estimates  of  the  Proportions  of  Uninsured  Asian:  year 

2000  101 

4-14  Small  Area  Estimates  of  the  Proportions  of  Uninsured  Asian:  year 

2000  (continued) 102 

4-15  Small  Area  Estimates  of  the  Proportions  of  Uninsured  Asian:  year 

2000  (continued) 103 



LIST  OF  FIGURES 


Figure 

page 

2-1  

47 

2-2  

48 


Abstract  of  Dissertation  Presented  to  the  Graduate  School 
of  the  University  of  Florida  in  Partial  Fulfillment  of  the 
Requirements  for  the  Degree  of  Doctor  of  Philosophy 

SOME  CONTRIBUTIONS  TO 
SMALL  AREA  ESTIMATION 

By 

Karabi  Sinha 
August  2004 

Chair:  Malay  Ghosh 
Major  Department:  Statistics 

This dissertation reviews existing facets of small area estimation and
contributes to some particular areas within it.

One  such  instance  is  the  problem  of  estimation  in  the  small  area  setup  where 
the  covariates  are  measured  with  error.  In  other  words,  it  considers  the  role  of 
measurement  error  models  in  small  area  estimation. 

In  the  majority  of  this  dissertation,  we  consider  simultaneous  estimation  of 
finite  population  means  for  several  strata  based  on  two  different  model  structures 
and  assumptions.  In  each  consideration,  a model-based  approach  is  taken,  where 
the  covariates  in  the  super-population  model  are  subject  to  measurement  errors. 

In  the  first  set-up,  EB  estimators  of  the  strata  means  are  developed  and  an 
asymptotic  expression  of  the  Mean  Square  Error  of  the  vector  of  EB  estimators  is 
attained.  In  the  second  set-up,  we  consider  developing  both  EB  and  HB  estimators 
of  the  strata  means.  In  both  cases,  findings  are  supported  by  appropriate  data 
analyses  and  are  further  validated  by  simulation  studies. 

Also, in this dissertation, we have considered small domain estimates of health
insurance coverage of minority superpopulations. We have considered here a
design-assisted model-based approach. Both the EB and the HB estimators are
developed and associated measures of precision are also found.


CHAPTER  1 

SMALL  AREA  ESTIMATION  : AN  OVERVIEW  AND  A SELECTIVE  REVIEW 


Preamble:  The  primary  objective  of  this  dissertation  is  to  address  small  area 
estimation problems where the auxiliary variables are subject to measurement error.
Both functional and structural measurement error models are considered. The
former  refers  to  the  situation  where  the  actual  covariates  are  nonstochastic,  while 
the  latter  refers  to  the  case  when  they  are  stochastic.  Empirical  and  Hierarchical 
Bayes  (EB  and  HB)  estimators  are  developed  based  on  these  models.  In  addition, 
we  have  also  considered  the  problem  of  small  domain  estimation  with  binary  data 
where  once  again,  EB  and  HB  estimators  are  developed. 

1.1  Introduction

We begin by qualifying the phrase “Small Area”. These are areas which are
not necessarily small in a geographic sense, but rather small in terms of the number
of data points captured from them in an actual sample survey. Small areas
may also describe a “small domain”, i.e., a small subpopulation such as a specific
age-sex-race group of people within a large geographic area. In this dissertation, we
will use these terms interchangeably.

An  area  estimate  is  usually  referred  to  as  a “direct  estimate”  if  it  is  based 
only  on  the  specific  sample  data  coming  from  that  area.  Because  the  number  of 
data points in such small areas is usually very small, direct estimators are
generally unreliable. In particular, such estimators are subject to large standard
errors and coefficients of variation. Quite understandably, in such situations,
the  challenge  before  a survey  statistician  is  to  go  beyond  the  use  of  traditional 
finite  population  inference  tools.  The  search  for  alternative  estimators  which  are 


1 


2 


better  than  these  direct  estimators  has  opened  up  extensive  statistical  research 
on  the  topic  commonly  known  as  Small  Area  Estimation.  We  will  abbreviate  this 
as  SAE.  It  has  been  argued  that  in  such  situations  where  the  sample  size  is  so 
small,  a reasonable  solution  for  SAE  may  be  based  on  the  principle  of  borrowing 
strength  from  neighboring  areas  i.e.,  those  having  similar  characteristics.  The  idea 
is  to  effectively  increase  the  sample  size  so  that  an  indirect  estimator  based  on  a 
larger  sample  may  be  derived.  In  most  situations  the  gain  is  in  terms  of  increased 
precision  of  the  resulting  estimators. 

While tracing the early history of the development and use of Small
Area Statistics, to be abbreviated as SAS, we notice that such statistics existed as
early as Eleventh Century England and also in Seventeenth Century Canada, based
on either census or administrative records. Furthermore, demographers have long
been using a variety of indirect methods for SAE of population. Typically, sampling
is not involved in the traditional demographic methods.

It must be emphasized that in recent times, demand for SAS has greatly
increased worldwide. This is due to their growing use in formulating and implementing
governmental policies and programs, in the allocation of government funds towards
regional planning, and in the private sector as well. Business decisions,
particularly those relating to small businesses, rely heavily on local socio-economic
conditions.

We will devote the rest of this chapter to tracing the history of SAE and
discussing details of those works that are relevant to the development of our own work
in the chapters to follow. In Section 1.2, we look at the development of SAE over
the last few decades and discuss some of the popular techniques in SAE. Section
1.3 looks at some of the areas of application of SAE. Section 1.4 discusses some of
the major publications on SAE. In Section 1.5, we provide a detailed discussion of
Bayes and Empirical Bayes estimators in standard models. Bayesian methods have
proved to be particularly well suited for the development of small area estimators
and are the primary tool used in our work. Alongside the development of these
estimators/predictors in SAE, there has been as much interest in, and research
on, improving the corresponding estimates of the mean square errors (Bayes risks)
of these estimators/predictors. Some of these works are discussed in Sections 1.7
and 1.8. In Sections 1.6 and 1.10, we take up two areas of application of SAE
stated in Section 1.3, and discuss in detail the works of Fay and Herriot
and Battese, Harter, and Fuller (henceforth to be abbreviated as BHF) in these
contexts. Much of their work has led to the conceptualization and development of
later works and eventually, to ours. Thus, it seems necessary to discuss their work
in some detail as natural precursors to our own. However, since much of BHF’s
work and our own deal with prediction, we devote Section 1.9 to discussing the
concept of normality-based prediction. An underlying theme in the majority of
our work is the concept of “measurement error”. So, we devote Section 1.11 to
the discussion of Measurement Error Models. In Section 1.12, we discuss the work
of Ghosh et al. (1998) as another precursor to a part of our own work. Finally, in
Section 1.13, we discuss the salient features of this dissertation, which addresses
certain areas of current importance in SAE. In particular, we propose to develop
Empirical and Hierarchical Bayesian methods when covariates in the assumed
super-population model are measured with error. We also propose to provide
EB and HB procedures in a Small Area set-up where the response is binary
in nature.

1.2  SAE  Techniques  : A Brief  Description 

Early examples of indirect estimators are synthetic and composite estimators. A
synthetic estimate of a small area mean, in the absence of auxiliary information, is
the overall sample mean for all small areas. Clearly such an estimator can be severely
biased. In contrast, in this case, a composite estimate of a small area mean is the
weighted  average  of  the  sample  mean  for  that  small  area  and  the  overall  mean.  The 
choice  of  weights  is  discussed,  among  others,  in  Schaible  (1978). 
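
As an illustration of the two estimators just described, the following minimal Python sketch computes a synthetic estimate (the overall sample mean) and composite estimates (weighted averages of each area mean and the overall mean). The data and the weight rule `n / (n + 2)` are hypothetical choices made only for this example; they are not the weights discussed in Schaible (1978).

```python
from typing import Dict, List


def composite_estimate(direct: float, synthetic: float, weight: float) -> float:
    """Weighted average of a direct estimate and a synthetic estimate."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * direct + (1.0 - weight) * synthetic


# Hypothetical sample values for three small areas (illustration only).
samples: Dict[str, List[float]] = {
    "area_1": [12.0, 15.0],
    "area_2": [9.0, 11.0, 10.0],
    "area_3": [20.0],
}

# Synthetic estimate without auxiliary information: the overall sample mean.
pooled = [y for ys in samples.values() for y in ys]
synthetic = sum(pooled) / len(pooled)

composite: Dict[str, float] = {}
for name, ys in samples.items():
    direct = sum(ys) / len(ys)           # area-specific sample mean
    weight = len(ys) / (len(ys) + 2.0)   # assumed weight; grows with the area sample size
    composite[name] = composite_estimate(direct, synthetic, weight)
```

With this assumed rule, an area with a single observation receives weight 1/3 on its own mean, so its composite estimate is pulled most strongly towards the overall mean.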

In the presence of auxiliary information, a composite estimator of a small area
mean is a weighted average of the direct survey estimator and some regression
estimator. On the other hand, a synthetic estimator puts the entire weight on
the regression estimator. Synthetic estimators have been proposed, for example, by
Ericksen (1974) and Cronkite (1987) to derive, respectively, population estimates
and employment/unemployment estimates for small areas. Estimates of this kind
have been proposed also for the study of Disability (1968), Periodic Household
Surveys (1977) and Drug Abuse Survey (1979). Rao (2003) provides an account of
these. However, composite estimates are more in vogue than synthetic estimates
in the SAE context for reasons discussed earlier. Following are some composite
methods introduced in the context of SAE. Ericksen (1974) and Cronkite (1987),
among others, used regression methods to derive, respectively, population estimates,
and employment and unemployment estimates for small areas. Fay and Herriot
(1979) suggested the use of the James-Stein estimator while Fay (1987) suggested
the use of multivariate regression in the context of SAE. Later we will discuss the
Fay-Herriot paper in some detail. Use of variance components models in the study
of SAE was suggested by Fuller and Harter (1987) and an application was discussed
in Battese et al. (1988). We will elaborate on this work later. The use of linear models
and generalized linear models in the context of SAE was emphasized by Cressie
(1990) and Ghosh et al. (1998), respectively.

Pfeffermann and Burck (1990) and Rao and Yu (1992) studied the extent of
possible use of time series and cross-sectional models in SAE. A sensitivity analysis
of SAE was carried out by Ericksen and Kadane (1987).



In  the  following  section,  we  provide  a list  of  examples  where  composite  estimators 
are  used. 

1.3  Some  Illustrative  Examples 

In  this  section,  we  provide  a partial  list  of  some  of  the  applications  of  SAE 
techniques  to  real  life  data  based  on  composite  estimators. 

(a) Small Area Income and Poverty Estimation: This is an ongoing project of the
US Bureau of the Census whose objective is to estimate the median income
as well as the proportion of people in poverty for small areas such as counties,
census tracts, and school districts. A useful reference in this context is
Ericksen, E.P. and Kadane, J.B. (1987). Sensitivity analysis of local estimates
of undercount in the 1980 U.S. Census. In Small Area Statistics. Eds. R.
Platek, J.N.K. Rao, C.E. Sarndal, and M.P. Singh, pp 23-45. Wiley, New
York.

(b)  Estimation  of  Per  Capita  Income  [PCI]  for  Small  Places  : First  started  with 
use  of  Ratio  Estimators  and  then  later  improved  with  EB  estimators  using 
Regression  Models  (Fay  and  Herriot,  1979). 

(c)  Estimation  of  Areas  under  Corn  & Soybeans  in  Counties  with  scanty  data  : 
Use  of  Variance  Components  Model  (Battese,  Harter  and  Fuller,  1988). 

(d)  Estimation  of  Median  Income  of  four-person  families  in  the  USA  : First 
started  with  the  use  of  regression  and  then  improved  by  using  EB  approach. 
Later  this  was  further  improved  by  using  time  series  methods  (Ghosh,  Nangia 
and  Kim,  1996). 

1.4  Some  Major  References 

Apart from the articles referenced in Section 1.2, there are many articles in
this area covering theoretical research along with applications involving real data sets.
Major sources are : (a) Journal of the American Statistical Association; (b)
Survey Methodology; (c) International Statistical Review; (d) Statistical Science
and (e) US Govt. Reports.

There  are  two  Review  Articles  : 

(a) Chaudhuri (1994) : Small domain statistics: A review (Statistica Neerlandica);

(b)  Ghosh  and  Rao  (1994)  : Small  Area  Estimation  : An  appraisal  (Statistical 
Science). 

Very  recently,  Wiley  published  a book  authored  by  Rao  (2003)  : Small  Area 
Estimation. 

There is an edited volume by Platek and Singh (1986) and another edited
volume by Platek, Rao, Sarndal, and Singh (1987). Both are Wiley Editions. The
latter volume contains articles on Policy Issues, Employment and Unemployment
Estimates, and Census Undercount, apart from the areas and topics mentioned in
Section 1.2.


1.5  Use  of  EB  and  HB  Estimators  in  SAE 

EB and HB methods are particularly well-suited for developing small area
estimators. One of the main features of EB estimators is that they are able to “borrow
strength from the ensemble”, i.e., use information from similar sources in constructing
estimators or predictors in addition to the most directly available source of
information.

According to the EB approach, first a Bayes estimator of the unknown parameter
of interest is obtained by using a prior on it. The Bayes estimate, however, is
a function of the prior parameters. These parameters are then estimated by some
classical method like the method of moments, the method of maximum likelihood or
some combination thereof, and are plugged into the expression for the Bayes estimator.
The resulting expression is the so-called EB estimator of the parameter of
interest.

The  main  focus  of  our  work  is  the  estimation  (or  rather,  prediction)  of  a finite 
population  mean  under  certain  super-population  models  using  Bayes  (B)  and 
EB  predictors.  In  the  following  sub-sections,  we  provide  a detailed  discussion  on 
Bayes  and  EB  estimators  by  following  a logical  sequence  of  specifications  of  some 
underlying  models. 

Bayes  and  EB  Estimators  in  Normal  Models 

Assume $Y_i \mid \theta_i \stackrel{\text{ind}}{\sim} N(\theta_i, D_i)$, with $D_i$ known, whereas the prior distribution for $\theta_i$ is given
by $\theta_i \stackrel{\text{iid}}{\sim} N(0, A)$, with $A$ known.

In a small area context, $Y_i$ is the direct estimator of the small area mean $\theta_i$. With
this model,

(1) $\theta_i \mid Y_i \sim N[AY_i/(A + D_i),\ AD_i/(A + D_i)]$

(2) Thus, the Bayes estimator of $\theta_i$ (which is the posterior mean of $\theta_i$ under
squared error loss) is given by : $\hat{\theta}_i^{B} = AY_i/(A + D_i)$.

This is true for any number of observations obeying the normality-based models
described above.
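
The posterior mean and variance in the normal model above can be sketched in a few lines of Python. The function name `normal_posterior` and the optional `prior_mean` argument are assumptions introduced for this illustration; with `prior_mean = 0` the function reproduces the posterior mean $AY_i/(A + D_i)$ and posterior variance $AD_i/(A + D_i)$.

```python
from typing import Tuple


def normal_posterior(y: float, d: float, a: float,
                     prior_mean: float = 0.0) -> Tuple[float, float]:
    """Posterior mean and variance of theta when
    Y | theta ~ N(theta, d) and theta ~ N(prior_mean, a), with d and a known."""
    if d <= 0.0 or a <= 0.0:
        raise ValueError("d and a must be positive")
    shrink = d / (a + d)                      # weight placed on the prior mean
    mean = (1.0 - shrink) * y + shrink * prior_mean
    variance = a * d / (a + d)
    return mean, variance


# Hypothetical values: direct estimate 10 with sampling variance 1, prior variance 3.
post_mean, post_var = normal_posterior(10.0, 1.0, 3.0)
```

The larger the sampling variance `d` is relative to the prior variance `a`, the more the direct estimate is shrunk towards the prior mean.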

The problems in this context center around different model specifications for
the $Y$'s and the $\theta$'s, and around situations where the parameters in their distributions
are known or unknown.

In an EB scenario, the prior variance and other parameters need to be
estimated from the marginal distribution of the $Y_i$'s.

In what follows, we will elaborate on various model specifications based on the
above model.


(a) Special Case : $D_i = D$, known

We begin by studying the special case in which the $D_i$'s are all equal to $D$ and
known.

Set-up : Assume $Y_i \mid \theta_i \stackrel{\text{ind}}{\sim} N(\theta_i, D)$; $D$ known; $i = 1, 2, \ldots, m$ $(> 2)$;

$\theta_i \stackrel{\text{iid}}{\sim} N(0, A)$, $A$ unknown; $i = 1, 2, \ldots, m$.

Then for each $i$,

(a.1) $\theta_i \mid Y_i \sim N[AY_i/(A + D),\ AD/(A + D)]$

(a.2) Thus, $\hat{\theta}_i^{B} = AY_i/(A + D)$ provided $A$ is known

(a.3) Marginally, $Y_i \stackrel{\text{iid}}{\sim} N(0, A + D)$

(a.4) Therefore, $\sum_i Y_i^2/(A + D) \sim \chi^2_m$.

(a.5) $E[1/\chi^2_m] = 1/(m - 2)$ [$> 0$ since $m > 2$]. So we may write

(a.6) $E[(m - 2)/\sum_j Y_j^2] = 1/(A + D)$

(a.7) We now re-write $\hat{\theta}_i^{B}$ in (a.2) above as : $\hat{\theta}_i^{B} = [1 - D/(A + D)]\, Y_i$; $i = 1, 2, \ldots, m$

(a.8) Finally, $\hat{\theta}_i^{EB} = [1 - D(m - 2)/\sum_j Y_j^2]\, Y_i$; $i = 1, 2, \ldots, m$

Remark 1.1 : There is a sense of “borrowing strength” in the above expression for
the EB estimator of $\theta_i$ for each $i$. The procedure is based on an unbiased estimator
of $1/(A + D)$. The estimators $\hat{\theta}_i^{EB}$ are the Stein estimators.
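
The estimator in the equal-variance case with a zero prior mean can be sketched as follows. This is a minimal illustration under the stated model; the function name `james_stein_eb` and the data are hypothetical.

```python
from typing import List


def james_stein_eb(y: List[float], d: float) -> List[float]:
    """EB (Stein) estimates [1 - D(m - 2) / sum_j Y_j^2] * Y_i for a common,
    known sampling variance d and a N(0, A) prior with A unknown."""
    m = len(y)
    if m <= 2:
        raise ValueError("the estimator requires m > 2 areas")
    total = sum(v * v for v in y)
    if total == 0.0:
        raise ValueError("sum of squares is zero; shrinkage factor undefined")
    # (m - 2) / total is the unbiased estimator of 1 / (A + D).
    shrink = d * (m - 2) / total
    return [(1.0 - shrink) * v for v in y]


# Hypothetical direct estimates for four areas with D = 1.
estimates = james_stein_eb([1.0, -2.0, 3.0, 0.5], 1.0)
```

Every area is multiplied by the same factor, so each estimate depends on the data from all areas through the pooled sum of squares. Note that the factor can be negative when the sum of squares is small; this sketch does not truncate it.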


(b) General Case : $D_i$'s are all known but unequal

Set-up : Assume $Y_i \mid \theta_i \stackrel{\text{ind}}{\sim} N(\theta_i, D_i)$; $D_i$ known; $i = 1, 2, \ldots, m$ $(> 2)$

$\theta_i \stackrel{\text{iid}}{\sim} N(0, A)$, $A$ unknown; $i = 1, 2, \ldots, m$

Then for each $i$,

(b.1) $\theta_i \mid Y_i \sim N[AY_i/(A + D_i),\ AD_i/(A + D_i)]$

(b.2) Therefore, $\hat{\theta}_i^{B} = AY_i/(A + D_i) = [1 - D_i/(A + D_i)]\, Y_i$, provided $A$ is known.

(b.3) Marginally, $Y_i \stackrel{\text{ind}}{\sim} N(0, A + D_i)$

(b.4) Therefore, $\sum_i [Y_i^2/(A + D_i)] \sim \chi^2_m$.

(b.5) So, equating the observed $\chi^2$ to its expected value, we obtain an equation:

$\sum_i [Y_i^2/(A + D_i)] = m$

from which $A$ can be solved. Since $\sum_i [Y_i^2/(A + D_i)]$ lies in between
$\sum_i [Y_i^2/(A + D_{\max})]$ and $\sum_i [Y_i^2/(A + D_{\min})]$, we can start with an initial
value for $A$ by writing : $m = \sum_i [Y_i^2/(A + D_{\text{median}})]$. Then we recursively solve
for $A$.

(b.6) Finally, $\hat{\theta}_i^{EB} = [1 - D_i/(\hat{A} + D_i)]\, Y_i$, $i = 1, 2, \ldots, m$, where $\hat{A}$ is the solution obtained in (b.5).

Remark 1.2 : Here also there is a sense of “borrowing strength” for “estimating”
$A$. Note that an exact unbiased estimator is not obtainable for $1/(A + D_i)$ for every
$i$. Therefore, we settle for an “estimate” of $A$ based on the entire data set, keeping
in mind that $A$ occurs in the denominator for $\hat{\theta}_i^{B}$. Then we take recourse to the
“plug-in” principle. Of course, the Stein-effect is still there.
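
One way to solve the moment equation of the unequal-variance case numerically is sketched below. The text describes a recursive scheme started from the median variance; this sketch swaps in plain bisection instead, which is valid because the left-hand side is strictly decreasing in $A$. Setting the estimate to zero when no positive root exists is an assumption made for this illustration, and the function names and data are hypothetical.

```python
from typing import List, Tuple


def solve_prior_variance(y: List[float], d: List[float],
                         tol: float = 1e-10, max_iter: int = 200) -> float:
    """Solve sum_i Y_i^2 / (A + D_i) = m for A >= 0 by bisection.
    All d[i] must be positive. Returns 0.0 if no positive root exists (assumed rule)."""
    m = len(y)

    def excess(a: float) -> float:
        return sum(yi * yi / (a + di) for yi, di in zip(y, d)) - m

    if excess(0.0) <= 0.0:
        return 0.0
    # At A = sum(Y^2)/m the left-hand side is at most m, so the root is bracketed.
    lo, hi = 0.0, sum(yi * yi for yi in y) / m
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def eb_unequal_variances(y: List[float], d: List[float]) -> Tuple[float, List[float]]:
    """Plug the solved A into [1 - D_i / (A + D_i)] * Y_i."""
    a_hat = solve_prior_variance(y, d)
    return a_hat, [(1.0 - di / (a_hat + di)) * yi for yi, di in zip(y, d)]


# Hypothetical direct estimates and known sampling variances.
a_hat, estimates = eb_unequal_variances([3.0, -4.0, 5.0, 1.0], [1.0, 2.0, 1.5, 0.5])
```

Areas with larger sampling variance are shrunk more strongly towards zero, since their factor `1 - D_i / (A + D_i)` is smaller.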


(c) General Case : IID priors on $\theta_i$ with non-zero mean but $D_i$'s are all equal and
known

Set-up : Assume $Y_i \mid \theta_i \stackrel{\text{ind}}{\sim} N(\theta_i, D)$; $D$ known; $i = 1, 2, \ldots, m$ $(> 2)$

$\theta_i \stackrel{\text{iid}}{\sim} N(\nu, A)$; $i = 1, 2, \ldots, m$; $\nu$ and $A$ both unknown.

Then for each $i$,

(c.1) $\theta_i \mid Y_i \sim N[(AY_i + D\nu)/(A + D),\ AD/(A + D)]$

(c.2) Therefore, $\hat{\theta}_i^{B} = (AY_i + D\nu)/(A + D)$ provided both $\nu$ and $A$ are known. In
case $\nu$ and $A$ are unknown, we re-write this expression as

$\hat{\theta}_i^{B} = Y_i[1 - D/(A + D)] + \nu[D/(A + D)]$.

(c.3) Marginally, $Y_i \stackrel{\text{iid}}{\sim} N(\nu, A + D)$; $i = 1, 2, \ldots, m$

(c.4) Therefore, $\sum_i [(Y_i - \bar{Y})^2/(A + D)] \sim \chi^2_{m-1}$. Moreover, similar to (a.5) and
(a.6), we may deduce that $E[(m - 3)/\sum_j (Y_j - \bar{Y})^2] = 1/(A + D)$, provided
$m > 3$.

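(c.5) So, replacing $1/(A + D)$ by its unbiased estimator from (c.4), we obtain an equation:

$1/(A + D) = (m - 3)/\sum_j (Y_j - \bar{Y})^2$

(c.6) Further, $\hat{\nu} = \bar{Y}$ and $\bar{Y}$ is independent of $\sum_j (Y_j - \bar{Y})^2$. Hence, finally, for
$i = 1, 2, \ldots, m$, from (c.2) and (c.5),

$\hat{\theta}_i^{EB} = Y_i\,[1 - D(m - 3)/\sum_j (Y_j - \bar{Y})^2] + \bar{Y}\,[D(m - 3)/\sum_j (Y_j - \bar{Y})^2]$

Here, again, there is a sense of borrowing strength in estimating $\nu$ and $1/(A + D)$
from the combined data.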

Remark 1.3 : The estimator $\hat{\theta}_i^{EB}$ is Lindley’s modification of the James-Stein
estimator. Here, instead of shrinkage towards a specific point, we shrink towards
an overall average. Casella (1985) goes on to explain the relative impact of the
quantity $T = \sum_j (Y_j - \bar{Y})^2/[(m - 1)D]$ on the $\hat{\theta}_i^{EB}$'s, by re-writing $\hat{\theta}_i^{EB}$ as

$\hat{\theta}_i^{EB} = \left[\frac{m - 3}{m - 1}\, T^{-1}\right] \bar{Y} + \left[1 - \frac{m - 3}{m - 1}\, T^{-1}\right] Y_i$.
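
Shrinkage towards the overall average, for a common known sampling variance and an unknown prior mean, can be sketched as below. The function name `shrink_to_mean` and the data are hypothetical; the weight on the overall mean is $D(m - 3)/\sum_j (Y_j - \bar{Y})^2$.

```python
from typing import List


def shrink_to_mean(y: List[float], d: float) -> List[float]:
    """EB estimates that shrink each Y_i towards the overall mean, with
    weight D(m - 3) / sum_j (Y_j - Ybar)^2 placed on the mean."""
    m = len(y)
    if m <= 3:
        raise ValueError("the estimator requires m > 3 areas")
    ybar = sum(y) / m
    ss = sum((v - ybar) ** 2 for v in y)
    if ss == 0.0:
        raise ValueError("all Y_i are equal; shrinkage weight undefined")
    b = d * (m - 3) / ss          # estimate of D / (A + D)
    return [(1.0 - b) * v + b * ybar for v in y]


# Hypothetical direct estimates for five areas with D = 1.
estimates = shrink_to_mean([2.0, 4.0, 6.0, 8.0, 10.0], 1.0)
```

For these data the weight on the overall mean is 2/40 = 0.05, which matches the equivalent form with $T = \sum_j (Y_j - \bar{Y})^2/[(m - 1)D] = 10$ and $(m - 3)/[(m - 1)T] = 0.05$.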


(d) More General Model : IID priors on $\theta_i$ with non-zero mean and $D_i$'s are
unequal but all known

Set-up : Assume $Y_i \mid \theta_i \stackrel{\text{ind}}{\sim} N(\theta_i, D_i)$; $D_i$'s all known but unequal;
$i = 1, 2, \ldots, m$ $(> 2)$

$\theta_i \stackrel{\text{iid}}{\sim} N(\nu, A)$; $i = 1, 2, \ldots, m$; $\nu$ and $A$ both unknown.

Then for each $i$,

(d.1) $\theta_i \mid Y_i \sim N[(AY_i + D_i\nu)/(A + D_i),\ AD_i/(A + D_i)]$

(d.2) Therefore, $\hat{\theta}_i^{B} = (AY_i + D_i\nu)/(A + D_i)$ provided both $\nu$ and $A$ are known.

(d.3) Marginally, $Y_i \stackrel{\text{ind}}{\sim} N(\nu, A + D_i)$; $i = 1, 2, \ldots, m$

(d.4) Therefore, $\sum_i [(Y_i - \nu)^2/(A + D_i)] \sim \chi^2_m$. However, it does not help this time
for “estimating” $A$ since $\nu$ is also assumed to be unknown.


(d.5) Define $\nu^* = \sum_j [Y_j/(A + D_j)] \big/ \sum_j [1/(A + D_j)]$.

From (d.3), it follows that $\nu^*$ is an unbiased estimator for $\nu$ based on the $Y_i$'s,
provided that $A$ is again known.

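(d.6) It also follows from standard distribution theory that the quantity $\nu^*$
is normally distributed [and that it is, in fact, the best unbiased estimate for
$\nu$, provided $A$ is known] and further that $\sum_j [(Y_j - \nu^*)^2/(A + D_j)]$ is distributed
as $\chi^2_{m-1}$. This follows by

(i) first making a transformation from $Y_i$ to $Z_i$ defined as $Z_i = Y_i/\sqrt{A + D_i}$;
$i = 1, 2, \ldots, m$ and

(ii) then making an orthogonal transformation from the $Z_j$'s to $U_j$'s where

$U_1 = \sum_j [Z_j/\sqrt{A + D_j}] \big/ \sqrt{\sum_j [1/(A + D_j)]}$

and (iii) then noting that $U_2, \ldots, U_m$ are iid $N(0, 1)$ and, finally, observing
that (iv)

$\sum_j [(Y_j - \nu^*)^2/(A + D_j)] = \sum_j Z_j^2 - U_1^2 = \sum_{j=2}^{m} U_j^2$

which is $\chi^2_{m-1}$.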

(d.7) So, equating the observed $\chi^2$ to its expected value, we obtain an equation:

$\sum_j [(Y_j - \nu^*)^2/(A + D_j)] = m - 1$

(d.8) We also have, from (d.5), another equation involving $A$ and $\nu^*$ :

$\nu^* = \sum_j [Y_j/(A + D_j)] \big/ \sum_j [1/(A + D_j)]$

We recursively solve these two equations for $\nu^*$ and $A$. Towards obtaining an
initial value for $\nu^*$, we can be guided by considerations similar to those laid
down above in (b.5). Thus, for example, an initial value of $\nu^*$ may be taken
as $\bar{Y}$ and then $A$ can be solved by referring to the $D_j$'s as $D_{\text{median}}$.

(d.9) Finally, from (d.2),

$\hat{\theta}_i^{EB} = (AY_i + D_i\nu^*)/(A + D_i)$,

with $A$ and $\nu^*$ replaced by the solutions of the equations in (d.7) and (d.8).

Here, again, there is a sense of borrowing strength in estimating $\nu^*$ and $A$ from the
combined data.
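
The two equations for the unequal-variance case with unknown prior mean can be solved jointly as sketched below. Instead of the alternating recursion described in the text, this sketch substitutes the weighted mean into the chi-square equation and applies bisection in $A$ alone, which works because the resulting weighted sum of squares is decreasing in $A$. Returning $A = 0$ when no positive root exists is an assumption of this illustration, and all names and data are hypothetical.

```python
from typing import List, Tuple


def weighted_mean(y: List[float], d: List[float], a: float) -> float:
    """nu*(A) = sum_j [Y_j / (A + D_j)] / sum_j [1 / (A + D_j)]."""
    w = [1.0 / (a + dj) for dj in d]
    return sum(wj * yj for wj, yj in zip(w, y)) / sum(w)


def weighted_ss(y: List[float], d: List[float], a: float) -> float:
    """sum_j (Y_j - nu*(A))^2 / (A + D_j)."""
    nu = weighted_mean(y, d, a)
    return sum((yj - nu) ** 2 / (a + dj) for yj, dj in zip(y, d))


def fit_unknown_mean(y: List[float], d: List[float],
                     tol: float = 1e-10) -> Tuple[float, float, List[float]]:
    """Return (A_hat, nu_hat, EB estimates); all d[j] must be positive."""
    m = len(y)
    target = m - 1
    if weighted_ss(y, d, 0.0) <= target:
        a_hat = 0.0                     # assumed truncation at zero
    else:
        ybar = sum(y) / m
        lo, hi = 0.0, sum((yj - ybar) ** 2 for yj in y) / target  # root lies in [lo, hi]
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if weighted_ss(y, d, mid) > target:
                lo = mid
            else:
                hi = mid
        a_hat = 0.5 * (lo + hi)
    nu_hat = weighted_mean(y, d, a_hat)
    theta = [(a_hat * yj + dj * nu_hat) / (a_hat + dj) for yj, dj in zip(y, d)]
    return a_hat, nu_hat, theta


# Hypothetical check with equal variances, where A = sum (Y - Ybar)^2 / (m - 1) - D.
a_hat, nu_hat, theta = fit_unknown_mean([2.0, 4.0, 6.0, 8.0, 10.0], [1.0] * 5)
```

With equal variances the weighted mean reduces to the ordinary mean, which gives a simple sanity check of the solver.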


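(e) Regression Model : Homoscedastic Case

Set-up : Assume $Y_i \mid \theta_i \stackrel{\text{ind}}{\sim} N(\theta_i, D)$; $D$ known

$\theta_i \stackrel{\text{ind}}{\sim} N(x_i^T b, A)$; $i = 1, 2, \ldots, m$, where each $x_i$ is a $p$-component vector.

Then for each $i = 1, 2, \ldots, m$

(e.1) $\theta_i \mid Y_i \sim N[(AY_i + Dx_i^T b)/(A + D),\ AD/(A + D)]$.

(e.2) Therefore, $\hat{\theta}_i^{B} = (AY_i + Dx_i^T b)/(A + D)$ provided both $b$ and $A$ are known.

(e.3) Marginally, $Y_i \stackrel{\text{ind}}{\sim} N(x_i^T b, A + D)$; $i = 1, 2, \ldots, m$.

(e.4) Let $X^T = (x_1, x_2, \ldots, x_m)$, and assume rank$(X) = p$. Then, $\hat{b} = (X^T X)^{-1} X^T Y$
and $\sum_i [(Y_i - x_i^T \hat{b})^2]/(A + D) \sim \chi^2_{m-p}$ if $b$ has $p$ components.

(e.5) Therefore, as in (a.6), $E[(m - p - 2)/\sum_i (Y_i - x_i^T \hat{b})^2] = 1/(A + D)$,
provided $m > (p + 2)$.

(e.6) Now we re-write $\hat{\theta}_i^{B}$ in (e.2) as

$\hat{\theta}_i^{B} = [1 - D/(A + D)]\, Y_i + [D/(A + D)]\, x_i^T b$
$= x_i^T b + [1 - D/(A + D)]\,(Y_i - x_i^T b)$

and form EB estimates of $\theta_i$ for each $i$, as indicated below.

(e.7) Thus, when $A$ is known,

(i) $\hat{\theta}_i^{EB} = Y_i^* + [1 - D/(A + D)]\,(Y_i - Y_i^*)$ where $Y_i^* = x_i^T \hat{b}$

and when $A$ is unknown, (ii)

$\hat{\theta}_i^{EB} = Y_i^* + [1 - (m - p - 2)D/S]\,(Y_i - Y_i^*)$
$= [(m - p - 2)D/S]\, Y_i^* + [1 - (m - p - 2)D/S]\, Y_i$

where $Y_i^* = x_i^T \hat{b}$ and $S = \sum_i (Y_i - x_i^T \hat{b})^2$.

Therefore, the EB estimator is a weighted average of the sample estimate and the
regression estimate.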
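
The homoscedastic regression case with unknown prior variance can be sketched in pure Python as follows. The ordinary least squares fit is obtained from the normal equations with a small Gaussian-elimination helper; the helper names (`solve_linear`, `ols`, `eb_regression`) and the data are hypothetical and introduced only for this illustration.

```python
from typing import List, Tuple


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("singular system; X may not have full column rank")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))) / aug[i][i]
    return x


def ols(x_rows: List[List[float]], y: List[float]) -> List[float]:
    """Least squares coefficients from the normal equations (X'X) b = X'Y."""
    p = len(x_rows[0])
    xtx = [[sum(r[i] * r[j] for r in x_rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(x_rows, y)) for i in range(p)]
    return solve_linear(xtx, xty)


def eb_regression(x_rows: List[List[float]], y: List[float],
                  d: float) -> Tuple[List[float], List[float]]:
    """EB estimates [(m-p-2)D/S] Y_i* + [1 - (m-p-2)D/S] Y_i with Y_i* = x_i' b_hat."""
    m, p = len(y), len(x_rows[0])
    if m <= p + 2:
        raise ValueError("the estimator requires m > p + 2")
    b_hat = ols(x_rows, y)
    fitted = [sum(bj * xj for bj, xj in zip(b_hat, r)) for r in x_rows]
    s = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    if s == 0.0:
        raise ValueError("residual sum of squares is zero")
    w = (m - p - 2) * d / s            # weight on the regression estimate
    return [w * fi + (1.0 - w) * yi for yi, fi in zip(y, fitted)], b_hat


# Hypothetical intercept-only design: reduces to shrinkage towards the overall mean.
theta, b_hat = eb_regression([[1.0]] * 5, [2.0, 4.0, 6.0, 8.0, 10.0], 1.0)
```

With an intercept-only design ($p = 1$) the weight becomes $(m - 3)D/S$, so the sketch reproduces the shrink-towards-the-mean estimator of case (c).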

(f)  Regression  Model  : Heteroscedastic  Case 


Set-up : Assume $Y_i \mid \theta_i \sim N(\theta_i, D_i)$; $D_i$ known for each $i$ $(= 1, 2, \ldots, m)$ but
possibly different.

$\theta_i \sim N(x_i^T b, A)$; $i = 1, 2, \ldots, m$.

Then for each $i$

(f.1) $\theta_i \mid Y_i \sim N[(A Y_i + D_i x_i^T b)/(A + D_i),\ A D_i/(A + D_i)]$

(f.2) Therefore, $\hat\theta_i^{B} = (A Y_i + D_i x_i^T b)/(A + D_i)$ provided both $b$ and $A$ are known

(f.3) Marginally, $Y_i \sim N(x_i^T b, A + D_i)$; $i = 1, 2, \ldots, m$.

(f.4) Therefore, $\hat b = (X'V^{-1}X)^{-1}X'V^{-1}Y$

where $V = \mathrm{Diag}((A + D_1), (A + D_2), \ldots, (A + D_m))$.

(f.5) Hence, $\hat\theta_i^{EB} = (A Y_i + D_i x_i^T \hat b)/(A + D_i)$ provided $A$ is known.

(f.6) To tackle the case of $A$ unknown, we note that, formally,
$\sum_i (Y_i - x_i^T \hat b)^2/(A + D_i) \sim \chi^2_{m-p}$. However, this time we are not in a
position to unbiasedly estimate $[1/(A + D_i)]$ for each $i$.

(f.7) We form an estimating equation by equating the expression above to its
expectation, i.e., $\sum_i (Y_i - x_i^T \hat b)^2/(A + D_i) = (m - p)$, and solve for the unique
positive $A$ satisfying both the above equation and that for $\hat b$ in (f.4). Call the
resulting estimator for $A$ as $A^*$. We take $A^* = 0$ if no positive solution exists.

(f.8) Thus, finally, the estimator assumes the form:

$$\hat\theta_i^{EB} = [A^*/(A^* + D_i)]\,Y_i + [D_i/(A^* + D_i)]\,Y_i^*$$

where $Y_i^* = x_i^T \hat b$.

Once  more  it  is  a weighted  average  of  the  sample  and  (weighted)  regression 
estimate. 
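
The estimating equation in (f.7) can be solved numerically. The sketch below assumes NumPy and uses plain bisection on $A$, which is a substitute for whatever iterative scheme one prefers; it relies on the fact that the weighted residual sum of squares, minimized over $b$, is non-increasing in $A$. The function names `wls_fit` and `fay_herriot_eb` and the example data are hypothetical.

```python
import numpy as np

def wls_fit(y, X, D, A):
    """Weighted least squares estimate of b as in (f.4), and the weighted RSS."""
    w = 1.0 / (A + D)
    XtW = X.T * w
    b = np.linalg.solve(XtW @ X, XtW @ y)
    resid = y - X @ b
    return b, float(np.sum(w * resid ** 2))

def fay_herriot_eb(y, X, D, tol=1e-10, max_iter=200):
    """Solve sum_i (Y_i - x_i'b)^2 / (A + D_i) = m - p for A*, then apply (f.8)."""
    y, X, D = (np.asarray(a, dtype=float) for a in (y, X, D))
    m, p = X.shape
    if wls_fit(y, X, D, 0.0)[1] <= m - p:
        A_star = 0.0                      # no positive solution exists
    else:
        lo, hi = 0.0, 1.0
        while wls_fit(y, X, D, hi)[1] > m - p:
            hi *= 2.0                     # expand until the root is bracketed
        for _ in range(max_iter):
            mid = 0.5 * (lo + hi)
            if wls_fit(y, X, D, mid)[1] > m - p:
                lo = mid
            else:
                hi = mid
            if hi - lo < tol:
                break
        A_star = 0.5 * (lo + hi)
    b_hat, _ = wls_fit(y, X, D, A_star)
    y_star = X @ b_hat                    # regression estimates Y_i*
    w = A_star / (A_star + D)             # weight on the direct estimate
    return w * y + (1.0 - w) * y_star, A_star, b_hat

# Illustrative data (hypothetical): unequal known sampling variances D_i.
y = np.array([10.0, 12.0, 9.0, 15.0, 14.0, 11.0, 13.0, 16.0])
X = np.column_stack([np.ones(8), np.arange(8.0)])
D = np.array([1.0, 0.5, 2.0, 1.0, 0.5, 2.0, 1.0, 1.5])
theta_eb, A_star, b_hat = fay_herriot_eb(y, X, D)
```

Areas with larger $D_i$ receive a smaller weight $A^*/(A^* + D_i)$ on the direct estimate and are pulled more strongly toward the regression estimate.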

1.6  Discussion  of  the  Work  of  Fay  and  Herriot 

In  the  following,  we  will  discuss  the  contents  of  the  article  [listed  in  (b)  of 
Section  1.3]  by  Fay  and  Herriot  (1979)  at  some  length.  They  recommended  the 
use  of  a James-Stein  Estimator  in  SAE.  In  effect,  the  final  form  of  the  estimator 
suggested  by  them  identifies  itself  with  an  EB  estimator. 

The article concerns the 1970 US Census of Population and the estimation of per
capita income [PCI], in the current year after the census, for local jurisdictions
with population sizes of at most 500. The general formula for estimating PCI in
the current year for any place is the census value of PCI for that place
multiplied by a factor, namely the ratio of the administrative estimate of PCI
for that place in the current year to the derived estimate of PCI in 1969. Thus
the 1970 census figures serve as the foundation for the PCI estimates. The following
drawbacks were noticed in the above computations:

(i) Income data were collected from a 20% sample in the 1970 census;
this resulted in small coverage for small places, thereby increasing the
standard errors of the resulting direct estimates;

(ii) The adjusted formula used county averages of PCI based on 1970 census data
in place of those of the specific small places; these county averages failed to
represent adequately the true values for these small places;




(iii)  Some  auxiliary  available  data  related  to  PCI  from  IRS  and  1970  Census  were 
not  used. 

The  work  by  Fay  and  Herriot  (1979)  centered  around  the  following  key  steps  : 

(a)  Fitting  a regression  equation  to  1970  census  PCI  estimates  for  small  places, 
using  as  independent  variables  county  values  of  PCI,  tax-return  data  for  1969 
and  data  on  housing  from  1970  census; 

(b)  Forming  a weighted  average  of  the  sample  estimate  and  the  regression 
estimate  for  each  small  place,  adjusting  the  weights  to  reflect  the  relative 
magnitudes  of  the  average  lack  of  fit  of  the  regression  and  the  variance  of  the 
sample  estimate; 

(c) Constraining each such weighted average to be within one standard error of
the sample estimate, thus preventing severe disagreement between the sample
estimate and the final estimate.

Following  Fay  and  Herriot,  we  start  with  the  model  : 

$$Y_i \mid \theta_i \sim N(\theta_i, D_i);\quad \theta_i \sim N(x_i^T b, A);\quad i = 1, 2, \ldots, m \qquad (1.1)$$

where the $D_i$'s and $x_i$'s are all known but $A$ is unknown. This is the model we
have discussed earlier in Section 1.5(f).

As pointed out by Fay and Herriot, the EB estimators [$\hat\theta_i^{EB}$'s] are, in effect, the
James-Stein Estimators, with the unknown population parameters estimated from
the data. In the following, we present the logical developments of the formulae,
following closely the derivations in Section 1.5.

(a)  EB  estimators  : Special  Case  of  the  Model  in  (1.1) 

$$\hat\theta_i^{EB} = [1 - (m - p - 2)D/S]\,Y_i + [(m - p - 2)D/S]\,Y_i^* \qquad (1.2)$$

where $A$ is unknown, the $D_i$'s are equal and the common value $D$ is known, and

(i) $\hat\theta_i^{EB}$ = EB estimate for the $i$th small place;

(ii) $m$ = number of small places;

(iii) $p$ = number of auxiliary variables in the regression model $[Y, Xb, (A + D)I]$;

(iv) $Y_i$ = reported value of $Y$ for the $i$th small place; $Y_i^*$ = fitted value of $Y$ for
the $i$th small place;

(v) $S = \sum_i (Y_i - Y_i^*)^2$.

Expression (1.2) is a version of (e.7)(ii) in Section 1.5(e).

(b)  EB  Estimators  : General  Case  of  the  Model  in  (1.1) 


$$\hat\theta_i^{EB} = \frac{A^*}{A^* + D_i}\,Y_i + \frac{D_i}{A^* + D_i}\,Y_i^* \qquad (1.3)$$

where

(i) $\hat\theta_i^{EB}$ = EB estimate for the $i$th small place;

(ii) $Y_i$ = reported value of $Y$ for the $i$th small place; $Y_i^*$ = fitted value of $Y$ for
the $i$th small place based on the model $[Y, Xb, V]$;

(iii) $m$ = number of small places;

(iv) $p$ = number of auxiliary variables in the regression model $[Y, Xb, V]$;

(v) $V = \mathrm{Diag}[A + D_i]$ where $D_i$ = known variance of $Y_i$; $A^*$ = estimate of the
unknown variance $A$ of the true means in the prior distributions, determined
from the equation: $Y'(V^{-1} - P_X)Y = m - p$ where $m > p$;

(vi) $P_X = V^{-1}X[X'V^{-1}X]^{-1}X'V^{-1}$.

In the above, computation of $A^*$ is made by trial and error, starting with
an initial choice which is used to compute $V$ and hence $P_X$. Then the
recursive procedure is followed. (Remark 1.4 discusses this aspect).

(c)  Final  Results  on  EB  estimators  (as  suggested  by  Fay  and  Herriot): 

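The final estimate is given by:

(i) $\hat\theta_i^{EB}$ if $Y_i - \sqrt{D_i} \le \hat\theta_i^{EB} \le Y_i + \sqrt{D_i}$;

(ii) $Y_i - \sqrt{D_i}$ if $\hat\theta_i^{EB} < Y_i - \sqrt{D_i}$;

(iii) $Y_i + \sqrt{D_i}$ if $\hat\theta_i^{EB} > Y_i + \sqrt{D_i}$,

where $\hat\theta_i^{EB}$ is defined in (b).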
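
The three cases in (c) amount to clipping each EB estimate to the interval of one standard error around the direct estimate. A minimal sketch, assuming NumPy; the function name `truncate_to_one_se` and the numbers are hypothetical.

```python
import numpy as np

def truncate_to_one_se(theta_eb, y, D):
    """Constrain each EB estimate to [Y_i - sqrt(D_i), Y_i + sqrt(D_i)]."""
    y = np.asarray(y, dtype=float)
    se = np.sqrt(np.asarray(D, dtype=float))   # standard error of the direct estimate
    return np.clip(np.asarray(theta_eb, dtype=float), y - se, y + se)

# Illustrative values (hypothetical): the first and third estimates get truncated.
final = truncate_to_one_se([5.0, 10.0, 20.0], [8.0, 10.5, 15.0], [4.0, 1.0, 9.0])
```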

Remark 1.4 : Several approaches have been suggested for the initial choice of $A^*$
and for solving for it. Prasad and Rao (1990) suggested starting with the
unweighted least squares approach to estimate the regression coefficients and
then using the method of moments to estimate $A$. On the other hand, Datta and
Lahiri (2000) have proposed maximum likelihood and residual maximum likelihood
to estimate $b$ and $A$. We do not discuss the details here.

1.7  Considerations  for  the  Calculation  of  Bayes  Risk  of  the  EB  estimator 

So  far,  we  have  dealt  with  EB  estimators  and  methods  of  estimation.  Now,  we 
will  discuss  the  issues  involved  in  the  calculation  of  the  mean  squared  error  (same 
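as the "Bayes Risk") of these EB estimators. Following Prasad and Rao (1990), we
consider the model $\hat\theta_i \mid \theta_i \sim N(\theta_i, V_i)$, $\theta_i \sim N(x_i^T b, A)$.

Writing $\hat\theta_i^{B} = (1 - B_i)\hat\theta_i + B_i x_i^T b$, where $B_i = V_i/(V_i + A)$, and
$\hat\theta_i^{EB}(A) = (1 - B_i)\hat\theta_i + B_i x_i^T \tilde b(A)$, where $\tilde b(A) = (X^T D^{-1} X)^{-1} X^T D^{-1} Y$,
$D = \mathrm{Diag}(A + V_1, \ldots, A + V_m)$ and $Y = (\hat\theta_1, \ldots, \hat\theta_m)^T$ is the vector of direct estimates,
we get

$$\hat\theta_i^{EB}(\hat A) = \hat\theta_i^{EB} = (1 - \hat B_i)\hat\theta_i + \hat B_i x_i^T \tilde b(\hat A).$$

Here $\tilde b(A)$ and $A$ are solved iteratively using the weighted least squares approach for
$\tilde b(A)$ and ML, REML, or the least squares method for $A$. The key point is that
$A$ should be estimated in such a way that the two components, $\hat\theta_i^{B} - \hat\theta_i^{EB}(A)$
and $\hat\theta_i^{EB}(A) - \hat\theta_i^{EB}(\hat A)$, are mutually orthogonal. Also, $\theta_i - \hat\theta_i^{B}$ is always
orthogonal to both $\hat\theta_i^{B} - \hat\theta_i^{EB}(A)$ and $\hat\theta_i^{EB}(A) - \hat\theta_i^{EB}(\hat A)$.

Thus,

$$E[\theta_i - \hat\theta_i^{EB}]^2 = E(\theta_i - \hat\theta_i^{B})^2 + E(\hat\theta_i^{B} - \hat\theta_i^{EB}(A))^2 + E(\hat\theta_i^{EB}(A) - \hat\theta_i^{EB}(\hat A))^2$$
$$= V_i(1 - B_i) + B_i^2 x_i^T (X^T D^{-1} X)^{-1} x_i + E(\hat\theta_i^{EB}(\hat A) - \hat\theta_i^{EB}(A))^2, \qquad (1.4)$$

where $D = \mathrm{Diag}(V_1 + A, \ldots, V_m + A)$ and $B_i = V_i/(V_i + A)$ $(i = 1, 2, \ldots, m)$.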

There  is  a nice  interpretation  of  this  formula.  The  first  term  represents  the 
Bayes  risk  (or  the  MSE)  due  to  the  subjective  prior.  The  second  term  represents 
the  excess  MSE  due  to  unknown  b.  The  third  term  represents  the  excess  MSE  due 
to  unknown  A. 

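Prasad and Rao (1990) estimate $V_i(1 - B_i)$ by $V_i(1 - \hat B_i)$ and $B_i^2 x_i^T (X^T D^{-1} X)^{-1} x_i$
by $\hat B_i^2 x_i^T (X^T \hat D^{-1} X)^{-1} x_i$ where $\hat D = \mathrm{Diag}(V_1 + \hat A, \ldots, V_m + \hat A)$. Finally,
they approximated $E[\hat\theta_i^{EB}(\hat A) - \hat\theta_i^{EB}(A)]^2$ by $E[(\hat A - A)\,\partial\hat\theta_i^{EB}(A)/\partial A]^2$, and
the latter was further approximated by $B_i^2 (V_i + A)^{-1}\,\mathrm{Var}(\hat A)$. Also, $\mathrm{Var}(\hat A)$
was estimated by $2\hat A^2 m^{-2} \sum_{i=1}^m (1 - \hat B_i)^{-2}$. Thus, $E(\theta_i - \hat\theta_i^{EB})^2$
is estimated by

$$V_i(1 - \hat B_i) + \hat B_i^2 x_i^T (X^T \hat D^{-1} X)^{-1} x_i + 2\hat B_i^2 (V_i + \hat A)^{-1} \hat A^2 m^{-2} \sum_{j=1}^m (1 - \hat B_j)^{-2}.$$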
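
The three-term estimate above can be evaluated directly once an estimate $\hat A$ is available. The following sketch assumes NumPy and simply transcribes the three terms as stated in this section; the function name `mse_three_terms` and the example inputs are hypothetical. It uses the identity $\hat A^2 (1 - \hat B_j)^{-2} = (\hat A + V_j)^2$ so that the third term remains computable when $\hat A = 0$.

```python
import numpy as np

def mse_three_terms(X, V, A_hat):
    """Three-term MSE estimate for each area, following the expression above.

    X : m x p design matrix, V : known sampling variances V_i, A_hat : estimate of A.
    Returns the total and the three separate terms.
    """
    X = np.asarray(X, dtype=float)
    V = np.asarray(V, dtype=float)
    m = X.shape[0]
    B = V / (V + A_hat)                               # B_i-hat = V_i / (V_i + A-hat)
    cov_b = np.linalg.inv((X.T / (V + A_hat)) @ X)    # (X' D-hat^{-1} X)^{-1}
    term1 = V * (1.0 - B)
    term2 = B ** 2 * np.einsum("ij,jk,ik->i", X, cov_b, X)
    # Var(A-hat) estimate: 2 A^2 m^{-2} sum (1 - B_j)^{-2} = 2 m^{-2} sum (A + V_j)^2
    var_A = 2.0 * np.sum((A_hat + V) ** 2) / m ** 2
    term3 = B ** 2 / (V + A_hat) * var_A
    return term1 + term2 + term3, (term1, term2, term3)

# Illustrative check (hypothetical): four areas, intercept only, V_i = 1, A-hat = 1.
total, terms = mse_three_terms(np.ones((4, 1)), np.ones(4), 1.0)
```

In this small example $\hat B_i = 0.5$, so the three terms are 0.5, 0.125 and 0.25 for every area.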


1.8  HB  Estimators  and  Corresponding  Bayes  Risks 


An  alternative  approach  to  this  general  small  area  estimation  problem  is  a 
hierarchical  Bayesian  approach  which  models  the  prior  distribution  in  stages.  The 
model  is  given  by 

(I) $\hat\theta_i \mid \theta_i, b, A \sim$ ind $N(\theta_i, V_i)$, $i = 1, 2, \ldots, m$;

(II) $\theta_i \mid b, A \sim$ ind $N(x_i^T b, A)$, $i = 1, 2, \ldots, m$;

(III) $\pi(b, A) \propto 1$.

The goal is to find the posterior distribution of $\theta$ given $\hat\theta$ since under squared error
loss, the posterior mean will give the Bayes Estimator and the posterior variance
will give the Bayes Risk.




A simple way to accomplish this is to use Markov chain Monte Carlo
(MCMC) methods (Gelfand and Smith, 1990; Casella et al, 1992; Casella et al, 1999).
However,  in  this  case,  one  can  proceed  analytically  up  to  a point  so  that  all  one 
needs  is  one-dimensional  numerical  integration. 

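Datta et al (1996) adopted the second approach. They began with the joint
posterior

$$\pi(\theta, b, A \mid \hat\theta) \propto \exp\Big[-\frac{1}{2}\sum_{i=1}^m (\hat\theta_i - \theta_i)^2/V_i\Big]\, A^{-m/2} \exp\Big[-\frac{1}{2A}\sum_{i=1}^m (\theta_i - x_i^T b)^2\Big]. \qquad (1.5)$$

First integrating with respect to $b$ and writing $P_X = X(X^T X)^{-1} X^T$, $V = \mathrm{Diag}(V_1, V_2, \ldots, V_m)$,

$$\pi(\theta, A \mid \hat\theta) \propto A^{-(m-p)/2} \exp\Big[-\frac{1}{2}(\hat\theta - \theta)^T V^{-1} (\hat\theta - \theta) - \frac{1}{2A}\,\theta^T (I_m - P_X)\,\theta\Big].$$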


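With the usual square completion device, it can be shown that

$$\theta \mid A, \hat\theta \sim N_m(Q V^{-1} \hat\theta,\ Q) \qquad (1.6)$$

where $Q = A\,[(I_m - D_0) + (I_m - D_0) P_X (I_m - D_0)]$ and $D_0 = \mathrm{Diag}(1 - B_1, 1 - B_2, \ldots, 1 - B_m)$.

On simplification,

$$Q V^{-1} \hat\theta = \big[(1 - B_1)\hat\theta_1 + B_1 x_1^T \tilde b,\ \ldots,\ (1 - B_m)\hat\theta_m + B_m x_m^T \tilde b\big]^T$$

where

$$\tilde b = \Big[\sum_{i=1}^m (1 - B_i) x_i x_i^T\Big]^{-1} \Big[\sum_{i=1}^m (1 - B_i) x_i \hat\theta_i\Big].$$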


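Also, after much simplification

$$\pi(A \mid \hat\theta) \propto A^{p/2} \prod_{i=1}^m (A + V_i)^{-1/2}\, \Big|\sum_{i=1}^m (1 - B_i) x_i x_i^T\Big|^{-1/2} \exp\Big[-\frac{1}{2A}\sum_{i=1}^m (1 - B_i)\hat\theta_i^2\Big]$$
$$\times \exp\Big[\frac{1}{2A}\Big(\sum_{i=1}^m (1 - B_i) x_i \hat\theta_i\Big)^T \Big(\sum_{i=1}^m (1 - B_i) x_i x_i^T\Big)^{-1} \Big(\sum_{i=1}^m (1 - B_i) x_i \hat\theta_i\Big)\Big].$$

Hence,

$$\text{Posterior Mean} = E(\theta_i \mid \hat\theta) = [1 - E(B_i \mid \hat\theta)]\,\hat\theta_i + E(B_i x_i^T \tilde b \mid \hat\theta)$$

and

$$\text{Posterior Variance} = \mathrm{var}(\theta_i \mid \hat\theta) = E[\mathrm{var}(\theta_i \mid A, \hat\theta) \mid \hat\theta] + \mathrm{var}[E(\theta_i \mid A, \hat\theta) \mid \hat\theta]$$

which simplifies to

$$E[V_i(1 - B_i) \mid \hat\theta] + E[B_i^2 x_i^T (X^T D^{-1} X)^{-1} x_i \mid \hat\theta] + \mathrm{var}[B_i(\hat\theta_i - x_i^T \tilde b) \mid \hat\theta].$$
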
One can now see that the first two terms in the Prasad-Rao MSE approximation
are direct estimates of the first two terms of the posterior variance by
substituting $\hat B_i$ for $B_i$. The third term in the Prasad-Rao MSE approximation is an
estimate of the third term of the posterior variance.
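
The one-dimensional numerical integration mentioned above can be sketched as follows. This is a minimal illustration, assuming NumPy, a user-supplied finite grid for $A$ (the truncation of the range of $A$ is an assumption of the sketch, not part of the model), and the trapezoid rule. The function names `_trapezoid` and `hb_posterior_summary` are hypothetical. The unnormalized density of $A$ is evaluated in the equivalent form $\prod_i (A + V_i)^{-1/2}\,|X^T D^{-1} X|^{-1/2} \exp[-\frac{1}{2}\sum_i (\hat\theta_i - x_i^T \tilde b)^2/(A + V_i)]$.

```python
import numpy as np

def _trapezoid(f, x):
    """Trapezoid rule along axis 0; f may be 1-D or 2-D."""
    f = np.asarray(f, dtype=float)
    dx = np.diff(x).reshape((-1,) + (1,) * (f.ndim - 1))
    return np.sum(0.5 * (f[1:] + f[:-1]) * dx, axis=0)

def hb_posterior_summary(theta_hat, X, V, A_grid):
    """Posterior mean and variance of each theta_i, integrating over A on a grid."""
    theta_hat = np.asarray(theta_hat, dtype=float)
    X = np.asarray(X, dtype=float)
    V = np.asarray(V, dtype=float)
    A_grid = np.asarray(A_grid, dtype=float)
    K, m = len(A_grid), len(theta_hat)
    log_post = np.empty(K)
    cond_mean = np.empty((K, m))
    cond_var = np.empty((K, m))
    for k, A in enumerate(A_grid):
        d = A + V                                  # diagonal of D
        B = V / d
        XtDinv = X.T / d
        M = XtDinv @ X                             # X' D^{-1} X
        M_inv = np.linalg.inv(M)
        b_tilde = M_inv @ (XtDinv @ theta_hat)
        resid = theta_hat - X @ b_tilde
        log_post[k] = (-0.5 * np.sum(np.log(d))
                       - 0.5 * np.linalg.slogdet(M)[1]
                       - 0.5 * np.sum(resid ** 2 / d))
        cond_mean[k] = theta_hat - B * resid       # E(theta_i | A, data)
        cond_var[k] = V * (1 - B) + B ** 2 * np.einsum("ij,jk,ik->i", X, M_inv, X)
    w = np.exp(log_post - log_post.max())          # unnormalized pi(A | data) on the grid
    norm = _trapezoid(w, A_grid)
    post_mean = _trapezoid(w[:, None] * cond_mean, A_grid) / norm
    second = _trapezoid(w[:, None] * (cond_var + cond_mean ** 2), A_grid) / norm
    return post_mean, second - post_mean ** 2

# Illustrative data (hypothetical).
theta_hat = np.array([10.0, 12.0, 9.0, 15.0, 14.0, 11.0, 13.0, 16.0])
X = np.column_stack([np.ones(8), np.arange(8.0)])
V = np.array([1.0, 0.5, 2.0, 1.0, 0.5, 2.0, 1.0, 1.5])
post_mean, post_var = hb_posterior_summary(theta_hat, X, V, np.linspace(0.0, 50.0, 2001))
```

The returned variance is the sum of the posterior expectation of the conditional variance and the posterior variance of the conditional mean, matching the decomposition displayed above.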


1.9  Normality-based  Prediction 

In  this  section,  we  will  discuss  prediction  formulae  in  multivariate  normal 
setup. 


1.9.1  Prediction  in  Multivariate  Normal  Population 

Under a multivariate normal set-up, if we have observations on a subset of the
variables, then the best point-wise prediction for the rest of the variables comes
from the concept of multivariate regression. Hence, using standard notations,
if $(X, Y)$ denotes a $(p + q)$-variate normal variable with component-wise mean
vector $(\mu_x, \mu_y)$ and dispersion matrix having components $\Sigma_{xx}, \Sigma_{xy}, \Sigma_{yy}$, then the
best point-wise prediction for $Y$, given $X = x$, is given by $\mu_y + \Sigma_{yx}\Sigma_{xx}^{-1}(x - \mu_x)$.
Hence the best predictor for the mean of the $Y$-components is given by
$q^{-1} 1^T [\mu_y + \Sigma_{yx}\Sigma_{xx}^{-1}(x - \mu_x)]$.

Since we can interpret the predictor also as a Bayes estimator, the above is
sometimes referred to as the Bayes Predictor for the entire population mean and is
denoted by $\hat\gamma^{B}$.

These  results  in  the  multivariate  normal  set-up  form  a basis  for  understanding 
some  preliminary  results  in  the  discussions  and  chapters  that  follow. 
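
As a small numerical illustration of the prediction formula in this subsection, the sketch below (assuming NumPy; the function name `best_predictor` and all numbers are hypothetical) computes $\mu_y + \Sigma_{yx}\Sigma_{xx}^{-1}(x - \mu_x)$ and then averages over the $q$ components of $Y$.

```python
import numpy as np

def best_predictor(x, mu_x, mu_y, S_xx, S_yx):
    """Conditional mean of Y given X = x under joint normality."""
    x, mu_x, mu_y = (np.asarray(a, dtype=float) for a in (x, mu_x, mu_y))
    S_xx = np.asarray(S_xx, dtype=float)
    S_yx = np.asarray(S_yx, dtype=float)
    # Solve S_xx z = (x - mu_x) rather than forming the inverse explicitly.
    return mu_y + S_yx @ np.linalg.solve(S_xx, x - mu_x)

# Illustrative values (hypothetical): p = 1 observed and q = 2 unobserved components.
pred = best_predictor(x=[2.0], mu_x=[0.0], mu_y=[1.0, 2.0],
                      S_xx=[[2.0]], S_yx=[[1.0], [0.5]])
mean_pred = pred.mean()   # predictor of the mean of the Y-components
```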




1.9.2  Bayes  Prediction  of  Domain  Means  in  Finite  Population  Sampling 

Suppose there are $m$ domains, labelled as $1, 2, \ldots, m$, and let $N_i$ and $n_i$ denote
respectively the sizes of the $i$th population domain and a sample drawn from
the same. We denote by $y_{ij}$ the "response" of the $j$th unit in the $i$th domain
($j = 1, 2, \ldots, N_i$; $i = 1, 2, \ldots, m$). We set $n = \sum_i n_i$ so that $n$ refers to the total
sample size from all the domains combined. We assume $n_i > 1$ for each $i$. Without
loss of generality, we denote the sampled units from the $i$th domain by the labels
$(1, 2, \ldots, n_i)$; $i = 1, 2, \ldots, m$. Our objective is to estimate, or more appropriately,
predict the finite population domain means

$$\gamma_i = N_i^{-1} \sum_{j=1}^{N_i} y_{ij}$$

on the basis of the sample observations $\{y_{ij} \mid j = 1, 2, \ldots, n_i\}$; $i = 1, 2, \ldots, m$.

In order to construct a Bayes predictor for the $i$th domain mean, we
postulate a super-population model for the $N_i$ observations constituting the vector
$Y_i$ in the $i$th domain. This is taken as $N_i$-variate normal with mean vector $\mu_i$ and
dispersion matrix $\Sigma_i$. Since our data relate to the first set of $n_i$ components of
$Y_i$, to be denoted by $y_i^{(1)}$, we will have an obvious decomposition of the corresponding
mean vector $\mu_i$ and the dispersion matrix $\Sigma_i$. Borrowing the results of the previous
section, we can now readily write down the expression for the best predictor of the
$i$th domain population mean, to be denoted by $\hat\gamma_i^{B}$, as

$$\hat\gamma_i^{B} = N_i^{-1}\Big[\sum_{j=1}^{n_i} y_{ij} + \sum_{j=n_i+1}^{N_i} E(y_{ij} \mid y_i^{(1)})\Big].$$

Note that in the above, for each $j = n_i + 1, n_i + 2, \ldots, N_i$, the set of $(n_i + 1)$
variables $(y_{i1}, y_{i2}, \ldots, y_{in_i}, y_{ij})$ follows an $(n_i + 1)$-variate normal distribution. Hence the
above computations are fairly routine. In applications, however, we impose certain
structures on the $\mu_i$'s and $\Sigma_i$'s and the resulting expressions are greatly simplified.




1.9.3  EB  Prediction  of  Domain  Means  in  Finite  Population  Sampling 

We refer to the set-up in Section 1.9.2. When all the population parameters
are known, we end up with Bayes estimators or predictors given by $\hat\gamma_i^{B}$. However,
in applications, the parameters are not all known and are to be estimated from the
given data. Once this is done and plugged into the expressions for the predictors, the
resulting predictors are known as EB predictors and are denoted by $\hat\gamma_i^{EB}$.

Ghosh and Meeden (1986) in particular introduced an EB approach in model-based
finite population sampling theory. They consider the model

$$y_{ij} = \theta_i + e_{ij} \quad (j = 1, 2, \ldots, N_i;\ i = 1, 2, \ldots, m), \qquad (1.7)$$

where the $\theta_i$ and the $e_{ij}$ are mutually independent with $\theta_i \sim N(\mu, \tau^2)$ and $e_{ij} \sim N(0, \sigma^2)$.

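Writing $B_i = \sigma^2/(\sigma^2 + n_i\tau^2)$, $y_i = (y_{i1}, y_{i2}, \ldots, y_{in_i})^T$ and $\bar y_i = n_i^{-1}\sum_{j=1}^{n_i} y_{ij}$
$(i = 1, 2, \ldots, m)$, the Bayes estimator of the stratum mean $\gamma_i = N_i^{-1}\sum_{j=1}^{N_i} y_{ij}$ under
squared error loss is given by

$$\hat\gamma_i^{B} = E\Big(N_i^{-1}\sum_{j=1}^{N_i} y_{ij} \,\Big|\, y_i\Big)$$
$$= N_i^{-1}\Big[\sum_{j=1}^{n_i} y_{ij} + \sum_{j=n_i+1}^{N_i} E(y_{ij} \mid y_i)\Big]$$
$$= N_i^{-1}\big[n_i \bar y_i + (N_i - n_i)\big((1 - B_i)\bar y_i + B_i\mu\big)\big]$$
$$= (1 - f_i B_i)\,\bar y_i + f_i B_i\,\mu, \qquad (1.8)$$

where $f_i = (N_i - n_i)/N_i$ $(i = 1, 2, \ldots, m)$. If $n_i/N_i \approx 0$, i.e., $f_i \approx 1$, $\hat\gamma_i^{B}$ agrees with the
standard Bayes estimator for the normal-normal model.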

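In the EB scenario, $\sigma$, $\mu$ and $\tau$ are all unknown and need to be estimated from
the marginal distribution of the $y_i$'s. Marginally, $(\bar y_1, \ldots, \bar y_m, \sum_{i=1}^m\sum_{j=1}^{n_i}(y_{ij} - \bar y_i)^2)$
is minimal sufficient for $(\mu, \sigma^2, \tau^2)$. Writing $n_T = \sum_{i=1}^m n_i$, we estimate $\sigma^2$ by

$$MSW = (n_T - m)^{-1}\sum_{i=1}^m\sum_{j=1}^{n_i}(y_{ij} - \bar y_i)^2,$$

the usual error mean square. Next writing $\bar y = n_T^{-1}\sum_{i=1}^m n_i \bar y_i$ and

$$MSB = (m - 1)^{-1}\sum_{i=1}^m n_i(\bar y_i - \bar y)^2,$$

the between mean square, it can be shown that

$$E(MSB) = \sigma^2 + g\,\tau^2 (m - 1)^{-1}$$

where $g = n_T - n_T^{-1}\sum_{i=1}^m n_i^2$. Also, $\mathrm{Var}(MSB) \to 0$ as $m \to \infty$ (while the $n_i$'s are
fixed). Hence, we estimate $M = \tau^2/\sigma^2$ consistently by

$$\hat M = \max\Big\{0,\ \Big(\frac{(m - 1)MSB}{(m - 3)MSW} - 1\Big)(m - 1)\,g^{-1}\Big\}. \qquad (1.9)$$

Also, writing $B_i = (1 + M n_i)^{-1}$, we estimate $B_i$ by $\hat B_i = (1 + \hat M n_i)^{-1}$ and
$\mu$ by $\hat\mu = [\sum_{i=1}^m (1 - \hat B_i)]^{-1}\sum_{i=1}^m (1 - \hat B_i)\bar y_i$ unless $\hat M = 0$, and by $m^{-1}\sum_{i=1}^m \bar y_i$
when $\hat M = 0$. Then the EB estimator of $\gamma_i$ is given by

$$\hat\gamma_i^{EB} = (1 - f_i \hat B_i)\,\bar y_i + f_i \hat B_i\,\hat\mu, \quad i = 1, 2, \ldots, m. \qquad (1.10)$$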


In the special case when $n_1 = n_2 = \cdots = n_m = n$, $\hat M$ simplifies to
$[(m - 1)MSB/\{(m - 3)MSW\} - 1]\,n^{-1}$ whenever this quantity is positive. Then,
$\hat B_i = (m - 3)MSW/\{(m - 1)MSB\} = \hat B$, say, for all $i$,
which is the usual James-Stein shrinker, and $\hat\mu = m^{-1}\sum_{i=1}^m \bar y_i$. Then $\hat\gamma^{EB} =
(\hat\gamma_1^{EB}, \ldots, \hat\gamma_m^{EB})^T$ is the finite population analogue of Lindley's modification of
the James-Stein estimator, and indeed, is very close to the same when the finite
population correction $f$ is approximately 1.
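
The steps leading to (1.10) can be sketched in code. This is a minimal illustration, assuming NumPy; the function name `ghosh_meeden_eb` and the example data are hypothetical, and the estimators of $\sigma^2$, $M$, $B_i$ and $\mu$ are transcribed from the formulas above.

```python
import numpy as np

def ghosh_meeden_eb(samples, N):
    """EB estimates of finite population domain means, following (1.9) and (1.10).

    samples : list of 1-D sequences, the sampled y_ij for each domain
    N       : population sizes N_i
    """
    n = np.array([len(s) for s in samples], dtype=float)
    ybar = np.array([np.mean(s) for s in samples])
    N = np.asarray(N, dtype=float)
    m, n_T = len(samples), n.sum()
    if m <= 3 or n_T <= m:
        raise ValueError("need m > 3 domains and n_T > m observations")
    msw = sum(np.sum((np.asarray(s, dtype=float) - yb) ** 2)
              for s, yb in zip(samples, ybar)) / (n_T - m)
    y_all = np.sum(n * ybar) / n_T
    msb = np.sum(n * (ybar - y_all) ** 2) / (m - 1)
    g = n_T - np.sum(n ** 2) / n_T
    M_hat = max(0.0, ((m - 1) * msb / ((m - 3) * msw) - 1.0) * (m - 1) / g)
    B = 1.0 / (1.0 + M_hat * n)
    if M_hat > 0:
        mu_hat = np.sum((1 - B) * ybar) / np.sum(1 - B)
    else:
        mu_hat = ybar.mean()
    f = (N - n) / N                       # fraction of each domain left unsampled
    return (1 - f * B) * ybar + f * B * mu_hat, M_hat, mu_hat

# Illustrative data (hypothetical): five domains, two sampled units in each.
samples = [[1, 3], [4, 6], [9, 11], [14, 16], [19, 21]]
gamma_eb, M_hat, mu_hat = ghosh_meeden_eb(samples, N=[20] * 5)
```

With equal sample sizes, as in this example, every domain shares the same shrinkage factor and $\hat\mu$ reduces to the unweighted average of the domain sample means.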

The moment we replace the Bayes predictors by the EB predictors,
we incur higher Bayes risks. Therefore, we need to verify certain
desirable properties of the EB predictors so computed. It is in this context that
researchers have introduced and discussed the concept of optimality. We say that
an EB predictor is first order asymptotically optimal if

$$m^{-1}\sum_{i=1}^m \big[E(\hat\gamma_i^{EB} - \gamma_i)^2 - E(\hat\gamma_i^{B} - \gamma_i)^2\big]$$

tends to 0 as $m$ tends to $\infty$. In other words, the average risk [taken over all the
domains] when the EB predictors are used tends to that of the Bayes predictors
when the number of domains becomes infinitely large. This presupposes that
in each domain the number of data points is still finite. In the literature on EB
predictors in the context of small area estimation, this has been an important
focus of research. Ghosh and Meeden proved this property for their EB estimator.

1.10  Use  of  Variance  Components  Models  in  SAE 

We  will  now  discuss  the  contents  of  the  article  by  BHF  [listed  in  (c)  of 
Section  1.3].  In  this  article,  these  authors  examined  the  applicability  of  a variance 
components  model  in  the  context  of  SAE. 

Data on areas under corn and soybeans refer to 12 counties but the samples
actually cover only a handful of segments/pixels within each county. There are
two sources: Survey Data [in terms of the number of segments under each type of
crop, as reported by farmers] and Satellite Data [again in terms of number of pixels
under each crop]. Also available is the Total Area covered by all the pixels in each of
the counties from Satellite Data.

The problem refers to prediction of Total Area under Corn / Soybeans for all the
counties as a whole [or on a county-by-county basis], each county comprising a
large number of segments.

BHF used the Satellite Data [on both Corn and Soybeans] as auxiliary
information and stipulated a model [for area under corn, for example] in terms of
both the auxiliary variates in the following way.

$$y_{ij} = b_0 + b_1 x_{1ij} + b_2 x_{2ij} + u_{ij} \qquad (1.11)$$

where

(i) $y_{ij}$ = survey figure in the $j$th segment of the $i$th county [Corn/Soybeans];

(ii) $b_0, b_1, b_2$ have their usual significance;

(iii) $u_{ij}$ = over-all error decomposable in the form: $u_{ij} = v_i + e_{ij}$.

In the above, $v_i$ denotes the county-specific random error [with mean 0 and variance
$\sigma_v^2$] while $e_{ij}$ represents the usual random error [with mean 0 and variance $\sigma_e^2$].
Further, the two error components are assumed to be independent.

As to the nature of data, we have available

(a) $[y_{ij}, x_{1ij}, x_{2ij} \mid 1 \le j \le n_i;\ 1 \le i \le k]$; $k$ = number of counties; $n_i$ = number of
sample segments from the $i$th county;

(b) County Totals $[T(x_{1i}), T(x_{2i})]$ for both the auxiliary variables for all
segments; $N_i$ = population number of segments in the $i$th county.

We will denote by $\bar X_{1i} = T(x_{1i})/N_i$ the population mean of $x_1$ in the $i$th county.
Similarly, the $\bar X_{2i}$ are defined and refer to the county-wise population means for the
second auxiliary variable. It is tacitly assumed that the $N_i$'s are known in advance.
Lastly, we will denote the county sample mean of $y$ as $\bar y_i$.

1.10.1  Prediction  in  Variance  Components  Model 

The prediction formula is concerned with the population mean of $Y$ per segment in
the $i$th county, to be denoted by $\gamma_i$, which is defined as the conditional mean of $y$
per segment, given the realized county effect [$v_i$] and the values of $x_1$ and $x_2$ for all
segments in the county. Thus we may write

$$\gamma_i = b_0 + b_1 \bar X_{1i} + b_2 \bar X_{2i} + v_i. \qquad (1.12)$$

Therefore, prediction of $\gamma_i$ for each $i$ amounts to (a) estimation of the fixed effects
parameters, the $b$'s, and (b) prediction of the random county-specific error $v_i$. However,
if the variance components were known, it would be a routine task to estimate the
$b$ coefficients by applying the technique of Weighted Least Squares. So we first
discuss below the techniques for estimation of variance components.

1.10.2 Estimation of $\sigma_e^2$

It follows from the model that $w_{ij} = y_{ij} - \bar y_i$ has the model representation:
$w_{ij} = b_1(x_{1ij} - \bar x_{1i}) + b_2(x_{2ij} - \bar x_{2i}) + (e_{ij} - \bar e_i)$ where $\bar e_i = n_i^{-1}\sum_j e_{ij}$, and $\bar x_{1i}$ and
$\bar x_{2i}$ refer to the sample means of $x_1$ and $x_2$ respectively in the $i$th domain. It is
not difficult to argue that the "Residual Sum of Squares [RSS]" under this revised
model in terms of the $w_{ij}$'s is distributed as $\sigma_e^2\chi^2$ with df $n^* = \sum_i (n_i - 1) - 2$. In
other words, $RSS/\sigma_e^2$ is distributed as $\chi^2_{n^*}$. Thus estimation of $\sigma_e^2$ is taken care of.

1.10.3 Estimation of $\sigma_v^2$

Towards this, BHF applied Henderson's Technique. We consider a simplified
version of the model wherein the errors are homogeneous for the time being. Thus
the $b$ coefficients are estimated by using the Ordinary Least Squares [OLS] technique.
Next we compute the mean of the residuals of the sample observations in the
$i$th county, denoted by $\bar u_i$. It has mean 0 and we now work out its variance [i.e.,
$E(\bar u_i^2)$] under the original model. Clearly this will turn out to be a linear function
of $\sigma_v^2$ and $\sigma_e^2$. Call it $g_i\sigma_v^2 + d_i\sigma_e^2$. Next we compute a weighted sum of the $\bar u_i^2$ as
$m_{..} = \sum n_i \bar u_i^2/\sum n_i g_i$ so that $E(m_{..}) = \sigma_v^2 + c\,\sigma_e^2$. Here $c = \sum n_i d_i/\sum n_i g_i$.

Therefore, $\sigma_v^2$ can be estimated as $\hat\sigma_v^2 = \max[0, m_{..} - c\,\hat\sigma_e^2]$.

Remark 1.5 : Having estimated the variance components, we are now in a
position to estimate the $b$ coefficients in a routine manner by the weighted least squares
technique.




1.10.4 Prediction of $v_i$

First note that we have already estimated the variance components and the
$b$ coefficients. These estimates may be suitably used to predict the $v_i$ values for each
county. Next note that for the $i$th county, the average of the residuals $\bar u_i$ and $v_i$
have a bivariate normal distribution with means 0, 0, variances $\sigma_v^2 + \sigma_e^2/n_i$, $\sigma_v^2$
and correlation coefficient $\rho$ where $\rho^2 = \sigma_v^2/(\sigma_v^2 + \sigma_e^2/n_i)$. Hence the conditional
mean of $v_i$, given $\bar u_i$, is given by $\bar u_i\,\rho\,\sigma_v/\sigma_{\bar u_i}$ which, in its turn, simplifies to $\rho_i \bar u_i$
where $\rho_i = \sigma_v^2/m_i$, $m_i$ being equal to $\sigma_v^2 + \sigma_e^2/n_i$. This conditional mean provides
a predictor for $v_i$ for each county after replacing $\bar u_i$ by its estimate obtainable by
use of the estimates of the $b$ coefficients and, further, using estimates of the variance
components. We denote the predictor so derived by $\hat v_i$.
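
The predictor of the county effect is a shrunken version of the county mean residual. A minimal sketch, assuming NumPy; the function name `predict_county_effects` and the numbers are hypothetical, and the variance components are taken as already estimated.

```python
import numpy as np

def predict_county_effects(u_bar, n, sigma_v2, sigma_e2):
    """Predict v_i as rho_i * u_bar_i, with rho_i = sigma_v^2 / (sigma_v^2 + sigma_e^2 / n_i)."""
    u_bar = np.asarray(u_bar, dtype=float)
    n = np.asarray(n, dtype=float)
    m_i = sigma_v2 + sigma_e2 / n          # variance of the county mean residual
    rho = sigma_v2 / m_i                   # shrinkage factor, between 0 and 1
    return rho * u_bar, rho

# Illustrative values (hypothetical): counties with 1 and 4 sampled segments.
v_hat, rho = predict_county_effects([2.0, -1.0], [1, 4], sigma_v2=1.0, sigma_e2=4.0)
```

A county with more sampled segments has a larger $\rho_i$, so its mean residual is shrunk less toward zero.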

1.10.5 Prediction of $\gamma_i$

Finally, the predictor for the $i$th county mean $\gamma_i$ is given by

$$\hat\gamma_i = \hat b_0 + \hat b_1 \bar X_{1i} + \hat b_2 \bar X_{2i} + \hat v_i. \qquad (1.13)$$

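Moreover, the variance of this predictor is given by

$$(1 - \rho_i)\,\sigma_v^2 + c_i^T\,\Sigma(\hat b)\,c_i \qquad (1.14)$$

where $c_i = \bar X_i - \rho_i \bar x_i$ is a vector of order $2 \times 1$ and $\Sigma(\hat b)$ is a matrix of order $2 \times 2$,
referring to the dispersion matrix of the estimates of $b_1$ and $b_2$.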

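Remark 1.6 : A predictor for the finite population mean of the $N_i$ segments is given
by

$$\Big[\sum\nolimits^{*} y_{ij} + \hat b_1 \sum\nolimits^{**} x_{1ij} + \hat b_2 \sum\nolimits^{**} x_{2ij} + (N_i - n_i)(\hat b_0 + \hat v_i)\Big]\Big/ N_i \qquad (1.15)$$

where $\sum^{*}$ refers to the sum over observed segments and $\sum^{**}$ refers to the sum
over the $N_i - n_i$ unobserved segments.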




Remark  1.7  : Datta  and  Ghosh  (1991)  developed  the  theory  for  prediction  of  the 
mean  of  unobserved  population  units  of  the  study  variable,  based  on  the  variance 
components  model  of  BHF.  In  what  follows,  we  briefly  present  their  results. 

Recall  the  BHF  Model  which  can  be  rephrased  as  : 

Conditional on $b_0, b_1, b_2$, $r$ and $\lambda$, $\quad y_{ij} = b_0 + b_1 x_{1ij} + b_2 x_{2ij} + v_i + e_{ij} \qquad (1.16)$

where the error components are independent normal and $V(e_{ij}) = \sigma_e^2 = 1/r$ and
$V(v_i) = \sigma_v^2 = 1/(r\lambda)$. At this stage, Datta and Ghosh assumed the following priors:

(i) the $b$'s have a uniform prior distribution over $R^p$;

(ii) $r$ has a gamma distribution with parameters $(a_0/2, g_0/2)$;

(iii) $r\lambda$ has a gamma distribution with parameters $(a_1/2, g_1/2)$,

with all the components above independently distributed. Then the joint predictive
distribution of the unobserved $y_{ij}$'s is multivariate $t$ with

(a) df $= n + g_0 + g_1 - p$ where $n$ = number of available observations across all
counties combined and $p$ = number of $b$ coefficients in the model;

(b) location parameter vector $M y_0$ of order $(N - n) \times 1$, $M$ being a matrix of
order $(N - n) \times n$ and $y_0$ being the observation vector of order $n \times 1$;

(c) scale parameter $[n + g_0 + g_1 - p]^{-1}[a_0 + a_1\lambda + y_0^T K y_0]\,G$ which is a matrix of
order $(N - n) \times (N - n)$, $K$ being a symmetric matrix of order $n \times n$ and $G$
being a symmetric matrix of order $(N - n) \times (N - n)$.

It may be noted that the specific forms of the matrices $K$ and $G$ depend on

(i) Partitioning of the model for observable and unobservable as $Y_O = X_O b$
and $Y_{UO} = X_{UO} b$ where $X_O = [X_{O1} | X_{O2}]$ and $X_{UO} = [X_{UO1} | X_{UO2}]$;

(ii) Partitioning of the Variance-Covariance matrix of all $N$ $y_{ij}$'s with the
components denoted by $\Sigma_{11}, \Sigma_{12}, \Sigma_{22}$ in usual notations.

Note that in the above joint distribution, $\lambda$ is involved. Datta and Ghosh
derived the form of the conditional distribution of $\lambda$, given $y_0$. This depends on
$[y_0, X_O, \Sigma_{11}, a_0, a_1, g_0$ and $g_1]$. They also discussed the computational
techniques. Once the predictive distributions are known, prediction of $y_{UO}$ across all
counties poses no difficulty.

1.11  Measurement  Error  Models 

In  the  following,  we  introduce  measurement  error  models.  Regression  models 
wherein  the  independent  variables  are  measured  with  error  are  called  Measurement 
Error  Regression  Models.  The  classical  linear  regression  model  with  only  one 
covariate  is  given  by 


$$Y_i = b_0 + b_1 x_i + e_i; \quad i = 1, 2, \ldots, m, \qquad (1.17)$$

where $(x_1, x_2, \ldots, x_m)$ is fixed in repeated sampling and the $e_i$ are $N(0, \sigma_e^2)$
random variables.

A measurement error model is an extension of the above model where one is unable
to observe $x_i$ directly. Instead of observing $x_i$, one observes the sum

$$X_i = x_i + u_i \qquad (1.18)$$

where $u_i$ is a $(0, \sigma_u^2)$ random variable.

The observed variable $X_i$ is sometimes called the manifest variable or the indicator
variable. The unobserved variable $x_i$ is called a latent variable. Models with fixed
$x_i$ are called functional models while models with random $x_i$ are called structural
models.

We  consider  both  functional  and  structural  measurement  error  models  in  the 
context  of  SAE  in  the  subsequent  chapters. 

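Fuller (1987) points out the effect of measurement errors on the standard least
squares coefficient based on (1.17) and (1.18). Assume that the $x_i$ are iid
$N(\mu_x, \sigma_x^2)$, independently of the $e_i$ and $u_i$, which are themselves normal. Then,

$$\begin{pmatrix} Y_i \\ X_i \end{pmatrix} \overset{iid}{\sim} N\left[\begin{pmatrix} b_0 + b_1\mu_x \\ \mu_x \end{pmatrix}, \begin{pmatrix} b_1^2\sigma_x^2 + \sigma_e^2 & b_1\sigma_x^2 \\ b_1\sigma_x^2 & \sigma_x^2 + \sigma_u^2 \end{pmatrix}\right].$$

Accordingly, if $\hat b_1 = \sum_i Y_i(X_i - \bar X)/\sum_i (X_i - \bar X)^2$ denotes the standard linear
regression coefficient, $E(\hat b_1) = b_1\sigma_x^2/(\sigma_x^2 + \sigma_u^2)$.

Thus, the standard least squares coefficient is not even a consistent estimate of $b_1$.
This fact also will be revealed later in our calculations.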
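
The attenuation of the least squares slope can be checked by simulation. The sketch below assumes NumPy and normally distributed $x_i$, $e_i$ and $u_i$; the function name `attenuation_demo`, the parameter values and the seed are all hypothetical choices made for illustration.

```python
import numpy as np

def attenuation_demo(b0=1.0, b1=2.0, mu_x=0.0, sigma_x=1.0,
                     sigma_u=1.0, sigma_e=0.5, m=200_000, seed=0):
    """Compare the naive slope on error-prone X with b1 * sigma_x^2 / (sigma_x^2 + sigma_u^2)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(mu_x, sigma_x, m)               # latent covariate
    Y = b0 + b1 * x + rng.normal(0.0, sigma_e, m)  # response, as in (1.17)
    X = x + rng.normal(0.0, sigma_u, m)            # observed covariate, as in (1.18)
    Xc = X - X.mean()
    b1_naive = np.sum(Y * Xc) / np.sum(Xc ** 2)    # standard least squares slope
    b1_limit = b1 * sigma_x ** 2 / (sigma_x ** 2 + sigma_u ** 2)
    return b1_naive, b1_limit

b1_naive, b1_limit = attenuation_demo()
```

With equal variances for the latent covariate and the measurement error, the naive slope settles near half of the true $b_1$ however large the sample becomes.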


1.12  Discussion  of  the  Work  of  Ghosh  et  al  (1998) 

Ghosh et al (1998) considered a hierarchical Bayesian approach towards small
area estimation based on generalized linear models. Suppose there are $m$ local
areas. Let $y_{ij}$ denote the response (discrete or continuous) of the $j$th unit in the
$i$th stratum. The $y_{ij}$ are assumed to be conditionally independent with pdf

$$f(y_{ij} \mid \theta_{ij}) = \exp[\phi_{ij}^{-1}(y_{ij}\theta_{ij} - \psi(\theta_{ij})) + \rho(y_{ij}; \phi_{ij})], \qquad (1.19)$$

($j = 1, \ldots, n_i$; $i = 1, \ldots, m$), where the $\phi_{ij} (> 0)$ are known. The canonical
parameters $\theta_{ij}$ are modeled as

$$h(\theta_{ij}) = x_{ij}^T b + u_i + e_{ij} \quad (j = 1, \ldots, n_i;\ i = 1, \ldots, m), \qquad (1.20)$$

where $h$ is a strictly increasing function, usually referred to as the link function,
the $x_{ij}$ $(p \times 1)$ are known design vectors, $b$ $(p \times 1)$ is the unknown regression
coefficient, the $u_i$ are the random effects and the $e_{ij}$ are the errors. It is assumed that
the $u_i$ and $e_{ij}$ are mutually independent with $u_i \overset{iid}{\sim} N(0, \sigma_u^2)$ and $e_{ij} \overset{iid}{\sim} N(0, \sigma_e^2)$.

It is possible to represent (1.19) and (1.20) in a hierarchical framework. Let
$\theta = (\theta_{11}, \ldots, \theta_{1n_1}, \ldots, \theta_{m1}, \ldots, \theta_{mn_m})^T$ and $u = (u_1, \ldots, u_m)^T$. Then the hierarchical
model is given by

(I) Conditional on $\theta, b, u, \sigma_u^2$ and $\sigma_e^2$, the $y_{ij}$ are independent with densities given
in (1.19).

(II) Conditional on $b, u, \sigma_u^2$ and $\sigma_e^2$, $h(\theta_{ij}) \overset{ind}{\sim} N(x_{ij}^T b + u_i, \sigma_e^2)$.

(III) Conditional on $b, \sigma_u^2$ and $\sigma_e^2$, $u_i \overset{iid}{\sim} N(0, \sigma_u^2)$.

To complete the hierarchical model, one assigns the following priors to $b$, $\sigma_u^2$
and $\sigma_e^2$:

(IV) $b$, $\sigma_u^2$ and $\sigma_e^2$ are mutually independent with $b \sim$ uniform$(R^p)$ $(p < m)$,
$\sigma_u^2 \sim IG(\frac{1}{2}a, \frac{1}{2}b)$ and $\sigma_e^2 \sim IG(\frac{1}{2}c, \frac{1}{2}d)$. [A random variable $Z \sim IG(a_1, a_2)$ if
$Z$ has pdf $f(z) \propto \exp(-a_2/z)\,z^{-a_1-1} I_{(0,\infty)}(z)$.]

The objective is to find the joint posterior distribution of the $g(\theta_{ij})$ for some strictly
increasing function $g$. Of particular relevance is estimation of the small area means
$\psi'(\theta_{ij})$. The Bayesian procedure was implemented by applying the Markov chain
Monte Carlo numerical integration technique in this general framework. The results
were extended to the analysis of multicategory data as well as some spatial data.

1.13  Salient  Features  and  Layout  of  this  Dissertation 

This dissertation addresses certain areas of current importance
in small area estimation. On one hand, we propose to develop EB and HB methods
for inference when certain covariates are measured with error in the assumed
normal models. On the other hand, we provide a Bayesian procedure for estimation
when the response is binary in nature. The outline of the rest of the dissertation
is as follows. In Chapter 2, we propose a functional measurement error model
and discuss an EB method of prediction of finite population means in a Finite
Population Sampling set-up. This is illustrated with applications to small area
data. In Chapter 3, we extend the previous model to a structural measurement
error model and propose EB and HB methods for prediction of finite population
means. Finally, in Chapter 4, we consider EB and HB estimation procedures for
a binary response. Chapter 5 offers some concluding remarks and scope for future
work.


CHAPTER  2 

EMPIRICAL  BAYES  ESTIMATION  IN  FINITE  POPULATION 
SAMPLING  UNDER  FUNCTIONAL  MEASUREMENT  ERROR  MODELS 

2.1  Introduction 

As  discussed  in  the  introduction,  EB  methods  are  widely  used  for  simultaneous 
estimation  or  prediction. 

These  methods  are  very  well-suited  in  finite  population  sampling  where  the 
target  is  to  estimate  simultaneously  several  strata  parameters,  for  example,  the 
strata  means.  This  is  especially  so  for  small  area  estimation  where  each  individual 
stratum  often  contains  very  few  observations,  and  direct  estimators  are  usually 
subject  to  large  standard  errors  and  coefficients  of  variation. 

Ghosh  and  Meeden  (1986)  considered  EB  estimation  of  finite  population 
strata  means  using  a model-based  approach.  They  used  a simple  one-way  random 
effects  ANOVA  model  for  this  purpose.  The  results  can  be  extended  by  inclusion  of 
covariates,  and  such  procedures  are  discussed  in  Ghosh  and  Meeden  (1996). 

Often,  however,  it  is  not  possible  to  obtain  exact  measurements  of  these 
covariates.  We  provide  a few  examples  to  illustrate  this. 

I.  This example is taken from Fuller (1987, p. 2). Suppose we want to predict
the  yield  of  corn  in  several  counties  in  Iowa  and  the  covariate  used  is  available 
nitrogen  in  the  soil.  To  estimate  the  available  soil  nitrogen,  it  is  necessary  to 
sample  the  soil  of  the  experimental  plot,  and  to  perform  a laboratory  analysis 
on  the  selected  sample.  As  a result  of  the  sampling  and  of  the  laboratory 
analysis,  we  do  not  observe  the  true  available  nitrogen,  but  only  its  estimate. 




II.  Suppose  we  have  several  strata  obtained  after  stratifying  by  gender,  ethnicity, 
age  and  region.  We  take  measures  on  the  blood  pressure  (bp),  body  weight 
(bw),  Body  Mass  Index  (BMI)  and  Waist-Hip  Ratio  (WHR)  for  each  subject 
within  each  stratum  and  we  want  to  model  the  average  bp  in  each  stratum 
as  a function  of  the  other  three  variables.  It  seems  likely  that  bw,  BMI  and 
WHR  would  be  measured  with  error. 

III.  Suppose we are looking at patients undergoing a rare surgical procedure
in  different  hospitals.  Since  this  is  a rare  surgery,  the  number  of  observed 
cases  in  each  hospital  will  be  very  small.  Now,  suppose,  we  are  interested 
in  modeling  the  time  to  recovery  by  bp,  heart  rate  (hr),  and  other  such 
measurements.  Then,  again,  it  seems  likely  that  the  measured  covariates  are 
affected  by  some  error. 

IV.  Suppose  we  are  interested  in  estimating  the  volume  of  trees  for  several  areas. 
We  have  data  (for  smaller  sub-areas)  on  volume  of  stem  wood  for  the  current 
year  and  measures  of  the  diameter  and  height  of  stem  wood  from  a previous 
year.  Here,  we  may  model  the  volume  of  trees  by  their  diameter  and  height. 
However,  the  latter  two  are  likely  to  be  measured  with  error. 

All  the  above  instances  seem  to  indicate  why  measurement  error  models  are  so 
suitable  for  simultaneous  estimation  of  strata  parameters. 

In  this  chapter,  we  develop  EB  procedures  for  simultaneous  estimation  of  finite 
population  strata  means  when  the  covariates,  say  x,  are  measured  with  error.  Our 
work  seems  to  be  a natural  extension  of  the  work  of  Ghosh  and  Meeden  (1986, 
1996).  We  are  also  assuming  that  the  unknown  true  covariates  are  non-stochastic. 
In  the  common  terminology  (cf.  Fuller,  1987,  p 2)  this  is  the  so-called  functional 
measurement  error  model.  This  is  in  contrast  with  the  structural  measurement 
error  model  where  the  unobserved  covariates  are  also  treated  as  random.  Structural 
measurement  error  models  will  be  discussed  in  the  next  chapter. 




EB estimators (or more appropriately predictors) for strata means are
developed in Section 2.2, taking into account the possibility that the covariates are
measured with error. Section 2.3 establishes the first order "asymptotic optimality"
property of these EB estimators. A simulation study is conducted in Section 2.4 to
compare the performance of the EB estimator to some of the standard estimators. The
proofs of certain technical results are deferred to the Appendix.

2.2  EB  Estimators 

Suppose there are $m$ strata labeled $1, \ldots, m$ and let $N_i$ denote the known
population size for the $i$th stratum. We denote by $y_{ij}$ the response of the $j$th unit
in the $i$th stratum $(j = 1, \ldots, N_i;\ i = 1, \ldots, m)$. A sample of size $n_i$ is drawn
from the $i$th stratum. Without loss of generality, we denote the sampled units by
$1, 2, \ldots, n_i$ $(i = 1, \ldots, m)$. Our objective is to estimate (or more appropriately
predict) the finite population means $\gamma_i = N_i^{-1} \sum_{j=1}^{N_i} y_{ij}$ $(i = 1, \ldots, m)$ on the basis
of the sample $y_{ij}$ $(j = 1, \ldots, n_i;\ i = 1, \ldots, m)$.

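Consider the super-population model $y_{ij} = \theta_i + e_{ij}$, where $e_{ij} \stackrel{iid}{\sim} N(0, \sigma_e^2)$. We
consider the prior $\theta_i \stackrel{ind}{\sim} N(b_0 + b_1 x_i, \sigma_u^2)$. Then, writing $y^T = (y_{11}, \ldots, y_{1n_1}, \ldots,
y_{m1}, \ldots, y_{mn_m})$, $B_i = \sigma_e^2/(\sigma_e^2 + n_i \sigma_u^2)$, $i = 1, \ldots, m$, the Bayes predictor of $\gamma_i$ is
given by

$$\begin{aligned}
\hat\gamma_i^{B} &= E\Big[ N_i^{-1} \Big( \sum_{j=1}^{n_i} y_{ij} + \sum_{j=n_i+1}^{N_i} y_{ij} \Big) \,\Big|\, y \Big] \\
&= N_i^{-1} \Big[ \sum_{j=1}^{n_i} y_{ij} + \sum_{j=n_i+1}^{N_i} E(\theta_i \mid y) \Big] \\
&= N_i^{-1} \Big[ \sum_{j=1}^{n_i} y_{ij} + (N_i - n_i)\{(1 - B_i)\bar{y}_i + B_i(b_0 + b_1 x_i)\} \Big] \\
&= (1 - f_i B_i)\bar{y}_i + f_i B_i (b_0 + b_1 x_i),
\end{aligned} \tag{2.1}$$

where $f_i = (N_i - n_i)/N_i$, $i = 1, \ldots, m$.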
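The closed form of the Bayes predictor can be evaluated directly once the model parameters are treated as known. The following is a minimal sketch, not the author's code; the function name `bayes_predictor` and its argument names are hypothetical, and the numbers in the usage line are chosen only for illustration.

```python
def bayes_predictor(y_bar, n, N, x, b0, b1, sigma2_e, sigma2_u):
    """Bayes predictor of one stratum mean when all parameters are known.

    y_bar    : sample mean of the stratum
    n, N     : sample size and population size of the stratum
    x        : true (error-free) covariate of the stratum
    """
    B = sigma2_e / (sigma2_e + n * sigma2_u)   # shrinkage factor B_i
    f = (N - n) / N                            # f_i = (N_i - n_i) / N_i
    # Weighted average of the sample mean and the regression prior mean
    return (1 - f * B) * y_bar + f * B * (b0 + b1 * x)


# Illustrative call: one unit sampled from a stratum of size 50
example = bayes_predictor(y_bar=370.0, n=1, N=50, x=55.0,
                          b0=100.0, b1=5.0, sigma2_e=100.0, sigma2_u=16.0)
```

With a single sampled unit the weight $f_i B_i$ on the regression mean is large, so the prediction sits close to $b_0 + b_1 x_i$; when the whole stratum is sampled ($n_i = N_i$) the predictor reduces to the sample mean.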




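Suppose, however, the $x_i$ are not observable. What we observe instead is
$X_{ij} \stackrel{ind}{\sim} N(x_i, \sigma_\eta^2)$, $j = 1, \ldots, n_i$, $i = 1, \ldots, m$. We assume the $X_{ij}$ to be
independent of $(y_{ij}, \theta_i)$. Then a pseudo-Bayes estimator of $\gamma_i$ is obtained by
replacing $x_i$ by $\bar{X}_i = n_i^{-1} \sum_{j=1}^{n_i} X_{ij}$ in (2.1), i.e. we estimate $\gamma_i$ by

$$\hat\gamma_i^{PB} = (1 - f_i B_i)\bar{y}_i + f_i B_i (b_0 + b_1 \bar{X}_i). \tag{2.2}$$

In an EB scenario, neither the regression coefficients $b_0$ and $b_1$ nor the variance
components $\sigma_e^2$, $\sigma_u^2$ and $\sigma_\eta^2$ are known, and they need to be estimated from
the marginal distributions of the $(y_{ij}, X_{ij})$ $(j = 1, \ldots, n_i;\ i = 1, \ldots, m)$.
Writing $\bar{y}_i = n_i^{-1} \sum_{j=1}^{n_i} y_{ij}$, $SSW_y = \sum_{i=1}^m \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2$
and $SSW_x = \sum_{i=1}^m \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)^2$, the minimal sufficient statistics are
$(\bar{y}_1, \ldots, \bar{y}_m, SSW_y, \bar{X}_1, \ldots, \bar{X}_m, SSW_x)$, whose components are mutually independent.
Also, $\bar{y}_i \sim N(b_0 + b_1 x_i, \sigma_u^2 + \sigma_e^2/n_i)$, $\bar{X}_i \sim N(x_i, \sigma_\eta^2/n_i)$, $SSW_y \sim \sigma_e^2 \chi^2_{n_T - m}$
and $SSW_x \sim \sigma_\eta^2 \chi^2_{n_T - m}$, where $n_T = \sum_{i=1}^m n_i$. Simpleminded initial estimators
of $b_1$ and $b_0$ are given respectively by
$\tilde{b}_1 = \sum_{i=1}^m n_i (\bar{X}_i - \bar{X})(\bar{y}_i - \bar{y}) / \sum_{i=1}^m n_i (\bar{X}_i - \bar{X})^2$
and $\tilde{b}_0 = \bar{y} - \tilde{b}_1 \bar{X}$, where $\bar{X} = \sum_{i=1}^m n_i \bar{X}_i / n_T$ and $\bar{y} = \sum_{i=1}^m n_i \bar{y}_i / n_T$. The following
theorem shows that $\tilde{b}_1$ is an inconsistent estimator of $b_1$, and consequently $\tilde{b}_0$ is also
an inconsistent estimator of $b_0$.

Theorem 2.2.1 Assume (i) $\max_{1 \le i \le m} n_i \le K < \infty$ and (ii) $\sum_{i=1}^m n_i (x_i - \bar{x})^2/(m - 1) \to c\ (> 0)$
as $m \to \infty$, where $\bar{x} = \sum_{i=1}^m n_i x_i / n_T$. Then $E(\tilde{b}_1) \to b_1 c/(\sigma_\eta^2 + c)$ and
$V(\tilde{b}_1) \to 0$ as $m \to \infty$.

The proof of the theorem is deferred to the Appendix. The theorem says that the
regression coefficient $\tilde{b}_1$ converges in probability to a fractional multiple of $b_1$. Thus
the regression coefficient is attenuated by measurement error. The theorem also
implies that a consistent estimator of $b_1$ is obtained when one multiplies $\tilde{b}_1$ by a
consistent estimator of $(\sigma_\eta^2 + c)/c$.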




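To this end, we observe that since $SSW_x \sim \sigma_\eta^2 \chi^2_{n_T - m}$, writing $MSW_x =
SSW_x/(n_T - m)$, $E(MSW_x) = \sigma_\eta^2$ and $V(MSW_x) = 2\sigma_\eta^4/(n_T - m)$. Hence, if
$n_T - m \to \infty$ as $m \to \infty$, $MSW_x \stackrel{P}{\to} \sigma_\eta^2$. Next, writing $SSB_x = \sum_{i=1}^m n_i (\bar{X}_i - \bar{X})^2$,
one can write $SSB_x = z_m^T (I_m - u_m u_m^T) z_m$, where $z_m^T = (\sqrt{n_1}\bar{X}_1, \ldots, \sqrt{n_m}\bar{X}_m)$,
$u_m^T = (\sqrt{n_1}/\sqrt{n_T}, \ldots, \sqrt{n_m}/\sqrt{n_T})$, and $I_m$ is the identity matrix of order $m$. Since
$z_m$ has mean vector $(\sqrt{n_1}x_1, \ldots, \sqrt{n_m}x_m)^T$ and variance-covariance matrix $\sigma_\eta^2 I_m$,
by the symmetry and idempotency of $I_m - u_m u_m^T$, $SSB_x \sim \sigma_\eta^2 \chi^2_{m-1}(\xi_m)$, where

$$\xi_m = \sum_{i=1}^m n_i (x_i - \bar{x})^2 / \sigma_\eta^2.$$

Thus, $E(SSB_x) = \sigma_\eta^2 (m - 1 + \xi_m)$, $V(SSB_x) = 2\sigma_\eta^4 (m - 1 + 2\xi_m)$ (Searle,
1971, p 50). Now writing $MSB_x = SSB_x/(m - 1)$, under the assumption of
Theorem 2.2.1, $E(MSB_x) = \sigma_\eta^2 [1 + \xi_m/(m - 1)] \to \sigma_\eta^2 (1 + c/\sigma_\eta^2) = \sigma_\eta^2 + c$,
$V(MSB_x) = 2\sigma_\eta^4 (m - 1)^{-1} [1 + 2\xi_m/(m - 1)] \to 0$ as $m \to \infty$. Hence, a consistent
estimator of $(\sigma_\eta^2 + c)/c$ is given by $MSB_x/(MSB_x - MSW_x)$. This leads to the
estimators of $b_1$ and $b_0$ as

$$\hat{b}_1 = [MSB_x/(MSB_x - MSW_x)]\,\tilde{b}_1, \qquad \hat{b}_0 = \bar{y} - \hat{b}_1 \bar{X}. \tag{2.3}$$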
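The attenuation result and its correction can be checked numerically. The sketch below is an illustration under stated assumptions rather than the author's implementation: the helper name `corrected_slope`, the equally spaced true covariates, the number of strata and the parameter values are all hypothetical, chosen so that $c \approx \sigma_\eta^2$ and the naive slope should be roughly half of $b_1$.

```python
import numpy as np


def corrected_slope(y_bar, X_bar, n, ssw_x):
    """Return the naive slope, the attenuation-corrected slope and intercept."""
    y_bar, X_bar, n = (np.asarray(a, dtype=float) for a in (y_bar, X_bar, n))
    m, n_T = n.size, n.sum()
    X_all = np.sum(n * X_bar) / n_T                  # weighted grand mean of X
    y_all = np.sum(n * y_bar) / n_T                  # weighted grand mean of y
    s_xx = np.sum(n * (X_bar - X_all) ** 2)          # SSB_x
    s_xy = np.sum(n * (X_bar - X_all) * (y_bar - y_all))
    b1_naive = s_xy / s_xx                           # attenuated estimator
    msb_x = s_xx / (m - 1)
    msw_x = ssw_x / (n_T - m)
    b1_hat = msb_x / (msb_x - msw_x) * b1_naive      # corrected slope
    b0_hat = y_all - b1_hat * X_all
    return b1_naive, b1_hat, b0_hat


# Hypothetical set-up: many small strata, fixed (non-stochastic) true covariates
rng = np.random.default_rng(0)
m, n_i = 20000, 3
b0, b1 = 100.0, 5.0
sigma2_e, sigma2_u, sigma2_eta = 100.0, 16.0, 25.0
x = np.linspace(0.0, 10.0, m)                        # true covariates
X = x[:, None] + np.sqrt(sigma2_eta) * rng.standard_normal((m, n_i))
theta = b0 + b1 * x + np.sqrt(sigma2_u) * rng.standard_normal(m)
y = theta[:, None] + np.sqrt(sigma2_e) * rng.standard_normal((m, n_i))
X_bar, y_bar = X.mean(axis=1), y.mean(axis=1)
ssw_x = np.sum((X - X_bar[:, None]) ** 2)
b1_naive, b1_hat, b0_hat = corrected_slope(y_bar, X_bar, np.full(m, n_i), ssw_x)
```

In this configuration $\sum n_i (x_i - \bar{x})^2/(m-1)$ is close to 25, which equals the assumed $\sigma_\eta^2$, so the naive slope should land near $b_1/2$ while the corrected slope should be close to $b_1$.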

Remark  2.1  : MSWx/MSBx  is  essentially  the  James-Stein  shrinker  but  for  a 
constant  multiplier.  This  highlights  the  role  of  this  shrinker  in  the  measurement 
error  context  as  well.  This  is  also  noticed  in  Whittemore  (1989)  in  a related  but 
slightly  different  context. 

Next, to estimate $B_i = \sigma_e^2/(\sigma_e^2 + n_i \sigma_u^2)$, one needs to estimate the variance
components $\sigma_e^2$ and $\sigma_u^2$. Writing $MSW_y = SSW_y/(n_T - m)$, $E(MSW_y) = \sigma_e^2$ and
$V(MSW_y) = 2\sigma_e^4/(n_T - m) \to 0$ as $m \to \infty$ if $n_T - m \to \infty$ as $m \to \infty$. Next let
$MSB_y = \sum_{i=1}^m n_i (\bar{y}_i - \bar{y})^2/(m - 1)$. Then we have the following theorem.




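Theorem 2.2.2 Assume (i) $\max_{1 \le i \le m} n_i \le K < \infty$, (ii) $\sum_{i=1}^m n_i (x_i - \bar{x})^2/(m - 1) \to c$
as $m \to \infty$ and (iii) $n_T - m \to \infty$ as $m \to \infty$. Then
$E(MSB_y) = \sigma_e^2 + b_1^2 \sum_{i=1}^m n_i (x_i - \bar{x})^2/(m - 1) + \sigma_u^2 g_m/(m - 1)$
$(g_m = n_T - \sum_{i=1}^m n_i^2/n_T)$ and $V(MSB_y) \to 0$ as $m \to \infty$.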

The  proof  of  this  theorem  is  also  deferred  to  the  Appendix.  Since  gm/{'m  - 1)  = 
nl{m  - 1)-^  ^ - l)■^  lim  infm^oo  {m  — l)/gm  = +00.  Hence  a 

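consistent estimator of $\sigma_u^2$ is given by $\{MSB_y - MSW_y - \hat{b}_1^2 (MSB_x - MSW_x)\}(m - 1)/g_m$.
Since $\sigma_u^2 \ge 0$, we estimate the same by

$$\hat\sigma_u^2 = \max\big(0, \{MSB_y - MSW_y - \hat{b}_1^2 (MSB_x - MSW_x)\}(m - 1)/g_m\big). \tag{2.4}$$

Now $B_i$ is estimated by $\hat{B}_i = MSW_y/(MSW_y + n_i \hat\sigma_u^2)$. This leads to the EB
estimator

$$\hat\gamma_i^{EB} = (1 - f_i \hat{B}_i)\bar{y}_i + f_i \hat{B}_i (\hat{b}_0 + \hat{b}_1 \bar{X}_i) \tag{2.5}$$

of $\gamma_i$ $(i = 1, \ldots, m)$. It can be shown that under the conditions of the theorem,
$\max_{1 \le i \le m} |\hat{B}_i - B_i| \stackrel{P}{\to} 0$, $\hat{b}_0 \stackrel{P}{\to} b_0$ and $\hat{b}_1 \stackrel{P}{\to} b_1$ as $m \to \infty$.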
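The truncated variance estimator, the estimated shrinkage factors and the resulting EB predictor can be assembled from the stratum summaries. The sketch below assumes the slope and intercept estimates have already been computed and are passed in; the function name `eb_predictor` and the small numerical inputs are hypothetical and serve only to exercise both branches of the truncation at zero.

```python
import numpy as np


def eb_predictor(y_bar, X_bar, n, N, ssw_y, ssw_x, b0_hat, b1_hat):
    """EB predictor of the stratum means from stratum-level summaries.

    Returns (gamma_eb, B_hat, sigma2_u_hat).
    """
    y_bar, X_bar, n, N = (np.asarray(a, dtype=float)
                          for a in (y_bar, X_bar, n, N))
    m, n_T = n.size, n.sum()
    y_all = np.sum(n * y_bar) / n_T
    X_all = np.sum(n * X_bar) / n_T
    msw_y = ssw_y / (n_T - m)
    msw_x = ssw_x / (n_T - m)
    msb_y = np.sum(n * (y_bar - y_all) ** 2) / (m - 1)
    msb_x = np.sum(n * (X_bar - X_all) ** 2) / (m - 1)
    g_m = n_T - np.sum(n ** 2) / n_T
    # Moment estimator of sigma_u^2, truncated at zero
    raw = (msb_y - msw_y - b1_hat ** 2 * (msb_x - msw_x)) * (m - 1) / g_m
    sigma2_u_hat = max(0.0, raw)
    B_hat = msw_y / (msw_y + n * sigma2_u_hat)       # estimated shrinkage
    f = (N - n) / N
    gamma_eb = (1 - f * B_hat) * y_bar + f * B_hat * (b0_hat + b1_hat * X_bar)
    return gamma_eb, B_hat, sigma2_u_hat


# Hypothetical toy inputs: three strata with two sampled units each
pos = eb_predictor([0.0, 5.0, 10.0], [1.0, 2.0, 3.0], [2, 2, 2], [10, 10, 10],
                   ssw_y=3.0, ssw_x=3.0, b0_hat=0.0, b1_hat=1.0)
trunc = eb_predictor([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], [2, 2, 2], [10, 10, 10],
                     ssw_y=30.0, ssw_x=3.0, b0_hat=1.0, b1_hat=1.0)
```

When the moment estimate of $\sigma_u^2$ is negative it is set to zero, every $\hat{B}_i$ equals one, and the predictor becomes $(1 - f_i)\bar{y}_i + f_i(\hat{b}_0 + \hat{b}_1 \bar{X}_i)$, i.e. the unsampled part of each stratum is predicted entirely by the fitted regression.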

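It follows from (2.2) and (2.5) that
$\hat\gamma_i^{EB} - \hat\gamma_i^{PB} = f_i (B_i - \hat{B}_i)\bar{y}_i + f_i (\hat{B}_i - B_i)(b_0 + b_1 \bar{X}_i) + f_i \hat{B}_i [(\hat{b}_0 - b_0) + (\hat{b}_1 - b_1)\bar{X}_i] \stackrel{P}{\to} 0$
as $m \to \infty$ for every $i$. If in addition,
$n_i \to \infty$, then $\bar{y}_i \stackrel{P}{\to} \gamma_i$ and the EB estimator is consistent for the finite population
mean.

In the next section, we provide an expression for the Bayes risk of $\hat\gamma^{PB}$ and
also show its asymptotic equivalence to $\hat\gamma^{EB}$ in terms of their Bayes risks.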


2.3  Bayes  Risks 

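We begin with the derivation of the Bayes risk of the predictor $\hat\gamma^{EB} =
(\hat\gamma_1^{EB}, \ldots, \hat\gamma_m^{EB})^T$ of $\gamma = (\gamma_1, \ldots, \gamma_m)^T$, the vector of population strata means. To
this end, we begin with the identity

$$m^{-1}\sum_{i=1}^m E(\hat\gamma_i^{EB} - \gamma_i)^2 = m^{-1}\sum_{i=1}^m E(\hat\gamma_i^{PB} - \gamma_i)^2 + m^{-1}\sum_{i=1}^m E(\hat\gamma_i^{EB} - \hat\gamma_i^{PB})^2 + 2m^{-1}\sum_{i=1}^m E[(\hat\gamma_i^{EB} - \hat\gamma_i^{PB})(\hat\gamma_i^{PB} - \gamma_i)]. \tag{2.6}$$

First, we calculate

$$m^{-1}\sum_{i=1}^m E(\hat\gamma_i^{PB} - \gamma_i)^2 = m^{-1}\sum_{i=1}^m E(\hat\gamma_i^{PB} - \hat\gamma_i^{B} + \hat\gamma_i^{B} - \gamma_i)^2 = m^{-1}\sum_{i=1}^m \big[E(\hat\gamma_i^{B} - \gamma_i)^2 + E(\hat\gamma_i^{PB} - \hat\gamma_i^{B})^2\big], \tag{2.7}$$

since $E(\gamma_i - \hat\gamma_i^{B} \mid y) = 0$. Now, writing $\bar{y}_i^* = (N_i - n_i)^{-1} \sum_{j=n_i+1}^{N_i} y_{ij}$,

$$\begin{aligned}
E(\hat\gamma_i^{B} - \gamma_i)^2 &= E\big[(1 - f_i B_i)\bar{y}_i + f_i B_i (b_0 + b_1 x_i) - (1 - f_i)\bar{y}_i - f_i \bar{y}_i^*\big]^2 \\
&= f_i^2 E\big[(1 - B_i)(\bar{y}_i - b_0 - b_1 x_i) - (\bar{y}_i^* - b_0 - b_1 x_i)\big]^2 \\
&= f_i^2 \big[(1 - B_i)^2 (\sigma_e^2/n_i + \sigma_u^2) + \sigma_e^2/(N_i - n_i) + \sigma_u^2 - 2(1 - B_i)\sigma_u^2\big] \\
&= f_i^2 \sigma_e^2 \Big[\frac{1 - B_i}{n_i} + \frac{1}{N_i - n_i}\Big],
\end{aligned} \tag{2.8}$$

using the facts that marginally $E(\bar{y}_i) = E(\bar{y}_i^*) = b_0 + b_1 x_i$,

$$V(\bar{y}_i) = E[V(\bar{y}_i \mid \theta_i)] + V[E(\bar{y}_i \mid \theta_i)] = E(\sigma_e^2/n_i) + V(\theta_i) = \sigma_e^2/n_i + \sigma_u^2,$$

$$V(\bar{y}_i^*) = E[V(\bar{y}_i^* \mid \theta_i)] + V[E(\bar{y}_i^* \mid \theta_i)] = \sigma_e^2/(N_i - n_i) + \sigma_u^2,$$

and

$$Cov(\bar{y}_i, \bar{y}_i^*) = E[Cov(\bar{y}_i, \bar{y}_i^* \mid \theta_i)] + Cov[E(\bar{y}_i \mid \theta_i), E(\bar{y}_i^* \mid \theta_i)] = V(\theta_i) = \sigma_u^2.$$

Again,

$$\begin{aligned}
E(\hat\gamma_i^{B} - \hat\gamma_i^{PB})^2 &= E\big[\{(1 - f_i B_i)\bar{y}_i + f_i B_i (b_0 + b_1 x_i)\} - \{(1 - f_i B_i)\bar{y}_i + f_i B_i (b_0 + b_1 \bar{X}_i)\}\big]^2 \\
&= E[f_i B_i b_1 (x_i - \bar{X}_i)]^2 \\
&= f_i^2 B_i^2 b_1^2 \sigma_\eta^2/n_i.
\end{aligned} \tag{2.9}$$

Hence, from (2.7) - (2.9),

$$m^{-1}\sum_{i=1}^m E(\hat\gamma_i^{PB} - \gamma_i)^2 = m^{-1}\sum_{i=1}^m f_i^2 \Big[\sigma_e^2 \Big\{\frac{1 - B_i}{n_i} + \frac{1}{N_i - n_i}\Big\} + B_i^2 b_1^2 \sigma_\eta^2/n_i\Big]. \tag{2.10}$$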
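The average Bayes risk of the pseudo-Bayes predictor is a closed-form function of the design and the model parameters, so it is easy to tabulate. The following sketch evaluates that expression stratum by stratum; the function name `pseudo_bayes_risk` is hypothetical, the parameters are treated as known, and every stratum is assumed to have $n_i < N_i$.

```python
def pseudo_bayes_risk(n, N, sigma2_e, sigma2_u, sigma2_eta, b1):
    """Average Bayes risk of the pseudo-Bayes predictor over the strata.

    n, N : sequences of stratum sample sizes and population sizes (n_i < N_i)
    """
    total = 0.0
    for n_i, N_i in zip(n, N):
        B = sigma2_e / (sigma2_e + n_i * sigma2_u)   # shrinkage factor B_i
        f = (N_i - n_i) / N_i                        # f_i
        # Risk of the Bayes predictor plus the price of not knowing x_i
        bayes_part = sigma2_e * ((1 - B) / n_i + 1 / (N_i - n_i))
        error_part = B ** 2 * b1 ** 2 * sigma2_eta / n_i
        total += f ** 2 * (bayes_part + error_part)
    return total / len(n)


# Illustrative values: a single stratum with one unit sampled out of 50
risk_no_error = pseudo_bayes_risk([1], [50], 100.0, 16.0, 0.0, 5.0)
risk_with_error = pseudo_bayes_risk([1], [50], 100.0, 16.0, 4.0, 5.0)
```

The second term grows linearly in $\sigma_\eta^2$ and quadratically in $b_1$, which is one way to see why predictors that rely on the regression become less attractive as the measurement error variance increases.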


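We will next show that under certain conditions $m^{-1} \sum_{i=1}^m E(\hat\gamma_i^{EB} - \hat\gamma_i^{PB})^2 \to 0$ as
$m \to \infty$. Specifically, we state the following theorem.

Theorem 2.3.1 Assume (i) $\min_{1 \le i \le m} n_i > 1$, (ii) $\max_{1 \le i \le m} n_i = K < \infty$, (iii)
$n_T - m \to \infty$, and (iv) $\sum_{i=1}^m n_i (x_i - \bar{x})^2/(m - 1) \to C\ (> 0)$ as $m \to \infty$. Then

$$m^{-1}\sum_{i=1}^m E(\hat\gamma_i^{EB} - \hat\gamma_i^{PB})^2 \to 0 \text{ as } m \to \infty.$$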


The  proof  of  the  theorem  is  given  in  the  Appendix. 

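In view of (2.10) and Theorem 2.3.1 and the Schwarz inequality,

$$\begin{aligned}
E\big|(\hat\gamma_i^{EB} - \hat\gamma_i^{PB})(\hat\gamma_i^{PB} - \gamma_i)\big| &\le E^{1/2}(\hat\gamma_i^{PB} - \gamma_i)^2\, E^{1/2}(\hat\gamma_i^{EB} - \hat\gamma_i^{PB})^2 \\
&\le \Big\{\sigma_e^2 \Big(\frac{1}{n_i} + \frac{1}{N_i - n_i}\Big) + \sigma_u^2 + b_1^2 \sigma_\eta^2/n_i\Big\}^{1/2} E^{1/2}(\hat\gamma_i^{EB} - \hat\gamma_i^{PB})^2 \\
&\le C\, E^{1/2}(\hat\gamma_i^{EB} - \hat\gamma_i^{PB})^2,
\end{aligned} \tag{2.11}$$

where $C\ (> 0)$ is a generic constant. Now

$$m^{-1}\sum_{i=1}^m E\big|(\hat\gamma_i^{EB} - \hat\gamma_i^{PB})(\hat\gamma_i^{PB} - \gamma_i)\big| \le C m^{-1} \sum_{i=1}^m E^{1/2}(\hat\gamma_i^{EB} - \hat\gamma_i^{PB})^2 \le C \Big[m^{-1}\sum_{i=1}^m E(\hat\gamma_i^{EB} - \hat\gamma_i^{PB})^2\Big]^{1/2} \to 0 \text{ as } m \to \infty. \tag{2.12}$$

The last inequality follows from the fact that if $Z$ is any positive random variable
assuming values $z_1, \ldots, z_m$ with equal probabilities, by Jensen's inequality
$m^{-1} \sum_{i=1}^m z_i^{1/2} = E(Z^{1/2}) \le (EZ)^{1/2} = (m^{-1} \sum_{i=1}^m z_i)^{1/2}$. Thus, from (2.10) - (2.12), we will have

$$m^{-1}\sum_{i=1}^m E(\hat\gamma_i^{EB} - \gamma_i)^2 = m^{-1}\sum_{i=1}^m E(\hat\gamma_i^{PB} - \gamma_i)^2 + o(1),$$

that is, the EB predictor of $\gamma$ has the same first order asymptotic risk as the
pseudo-Bayes predictor $\hat\gamma^{PB}$ of $\gamma$ given in (2.2).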

2.4  Simulation  Study 

In the following, we discuss a simulation study comparing the EB estimator
found earlier to some of the existing standard estimators such as the sample
mean and the regular regression estimator without measurement error.

We create a population of size 1400, spread over 12 strata (i.e., $m = 12$).
The strata sizes $N_i$ are taken as 50, 250, 50, 100, 200, 150, 50, 150, 100, 150, 100,
50. We generate the population with several values of the variance components
and regression coefficients. The results were fairly insensitive to the values of
$\sigma_e^2$, $\sigma_u^2$ and $b_0$. We have thus reported the results with $\sigma_e^2 = 100$, $\sigma_u^2 = 16$ and
$b_0 = 100$. However, the results seem to be sensitive to the values of $b_1/\sigma_\eta^2$. We
have considered several pairs of values of $(b_1, \sigma_\eta^2)$ and report in this study the
results with $b_1 = 5$ with varying values of $\sigma_\eta^2$. The true $x_i$ are taken as 55, 173,
165, 182.5, 189, 266, 190, 223.33, 232, 207.33, 227, 218. Then the $X_{ij}$ are generated
from a normal distribution with means $x_i$ and variance $\sigma_\eta^2$. There are 1400 such
$X_{ij}$'s. Also, we generate the $\theta_i$ from $N(b_0 + b_1 x_i, \sigma_u^2)$. Finally, the $y_{ij}$ are generated
from $N(\theta_i, \sigma_e^2)$. Thus, we have 1400 $X_{ij}$'s and $y_{ij}$'s, spanned over 12 strata. This
constitutes the population.

We  draw  a set  of  1000  independent  samples  from  this  population.  For  each 
stratum,  the  sample  size  in  each  case  is  2%  of  the  corresponding  population  size. 
Thus,  the  stratum  sample  sizes  are  1,  5,  1,  2,  4,  3,  1,  3,  2,  3,  2,  1.  For  each  sample 
in  each  stratum,  we  calculate  the  sample  mean,  the  simple  regression  estimate 
(without  measurement  error)  and  the  EB  estimate.  The  final  three  estimates  for 
each  stratum  are  obtained  as  averages  of  these  1,000  estimates. 
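The population and sampling design described above can be sketched in a few lines. This is an illustration under stated assumptions, not the code used for the reported results: the function names are hypothetical, the random number generator and seed handling are arbitrary, and simple random sampling without replacement within each stratum is assumed because the text does not state the within-stratum sampling scheme.

```python
import numpy as np

# Strata sizes and true covariates as listed in the text
N = np.array([50, 250, 50, 100, 200, 150, 50, 150, 100, 150, 100, 50])
x = np.array([55, 173, 165, 182.5, 189, 266, 190, 223.33,
              232, 207.33, 227, 218])
n = np.rint(0.02 * N).astype(int)        # 2% sample in every stratum


def generate_population(sigma2_eta, rng, b0=100.0, b1=5.0,
                        sigma2_e=100.0, sigma2_u=16.0):
    """Generate responses y_ij and error-prone covariates X_ij per stratum."""
    theta = rng.normal(b0 + b1 * x, np.sqrt(sigma2_u))
    y = [rng.normal(theta[i], np.sqrt(sigma2_e), size=N[i]) for i in range(12)]
    X = [rng.normal(x[i], np.sqrt(sigma2_eta), size=N[i]) for i in range(12)]
    return y, X


def draw_sample(y, X, rng):
    """Assumed scheme: simple random sample without replacement per stratum."""
    ys, Xs = [], []
    for i in range(12):
        idx = rng.choice(int(N[i]), size=int(n[i]), replace=False)
        ys.append(y[i][idx])
        Xs.append(X[i][idx])
    return ys, Xs
```

Repeating `draw_sample` 1000 times on one generated population and averaging the squared deviations of each estimator from the stratum means of `y` would give Monte Carlo mean squared errors of the kind reported in the tables.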

We consider $\sigma_\eta^2$ = 0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5 and 5.0. For
each stratum, we calculate the squared deviation of each of these three estimators
from the true means, and report the average (over 1,000 samples) of these squared
deviations. In particular, we report the results (referred to as RMSE's in the
tables, after taking square root of the MSE's) for $\sigma_\eta^2$ = 0, 2.5, 4.0 and 4.5 to discuss
the salient features of the estimators in this interval. The findings are summarized
below:

(i)  We see that when $\sigma_\eta^2$ is in the interval [0, 1), the simple regression estimator
is performing the best in most of the strata, followed by the EB estimator.
The sample mean is the worst. This is intuitively anticipated, since when
the measurement error is either absent or is very small, the usual regression
estimator performs much better than the standard estimator, namely the
sample mean. The EB estimator, being a kind of weighted average of the
above two, performs somewhere in between.

(ii)  Next, when $\sigma_\eta^2$ is in the interval [1, 3), the EB estimator performs the best in
almost all the strata. The simple regression estimator follows next and the
sample mean is the worst of the three.

(iii)  When $\sigma_\eta^2$ is in the interval [3, 4), the EB estimator still performs the best in
almost all the strata, but now, the sample mean starts improving over the
regression estimator in most of the strata.

(iv)  Lastly, when $\sigma_\eta^2$ is in the interval [4, 5), the sample mean starts doing equally
well as the EB estimator, while the simple regression estimator slumps in all
the strata. The intuitive explanation for the same is that with high variability
of the covariates measured with error, any model-based procedure is bound
to provide relatively unstable estimates, when compared to the sample mean,
but the composite EB estimator rectifies much of this problem.

The  following  tables  illustrate  the  above  points. 

Table 2-1: The sample sizes ($n_i$), the population ("true") means (TM), the sample
means (SM), the regression estimates (R), the empirical Bayes estimates (EB) and
the corresponding RMSE's for the 12 strata when $\sigma_\eta^2 = 0$

  i  n_i       TM       SM        R       EB  RMSE(SM)  RMSE(R)  RMSE(EB)
  1    1   372.06   371.73   375.51   374.75      9.01     7.34      7.29
  2    5   968.47   968.49   966.58   967.48      4.16     2.92      3.18
  3    1   927.92   927.78   926.51   926.60      9.70     2.81      3.88
  4    2  1022.02  1022.16  1014.16  1016.75      7.03     8.12      6.69
  5    4  1041.80  1041.81  1046.72  1044.49      5.14     5.29      4.54
  6    3  1430.32  1430.37  1432.42  1431.64      5.17     3.93      4.01
  7    1  1047.11  1047.15  1051.73  1050.76     10.75     5.01      5.25
  8    3  1224.97  1224.75  1218.68  1221.12      5.97     6.63      5.68
  9    2  1257.64  1257.63  1262.11  1260.62      6.70     5.02      4.74
 10    3  1137.42  1137.09  1138.54  1137.96      5.30     2.20      3.13
 11    2  1238.46  1238.56  1237.06  1237.58      8.08     2.58      4.13
 12    1  1191.31  1191.42  1191.98  1191.95      9.57     2.11      3.37

We see that in strata 2, 3, 6, 7, 10, 11, and 12, the simple regression estimator
performs the best (i.e., has the lowest RMSE). In these strata, the EB estimator
finishes a close second behind the regression estimator. The sample mean has the
largest RMSE. On the other hand, in strata 1, 4, 5, 8, and 9, the EB estimator
surpasses the regression estimator with smallest RMSE, and in three of these
strata, namely strata 4, 5 and 8, even the sample mean has slightly smaller
RMSE than the regression estimate.

Table 2-2: The population means (TM), the sample means (SM), the regression
estimates (R), the empirical Bayes estimates (EB), and the corresponding RMSE's
for the 12 strata when $\sigma_\eta^2 = 2.5$

  i  n_i       TM       SM        R       EB  RMSE(SM)  RMSE(R)  RMSE(EB)
  1    1   375.68   375.68   372.85   372.95     10.15     8.70      8.67
  2    5   961.23   961.42   963.64   962.83      4.68     4.67      4.17
  3    1   920.50   920.51   925.56   924.66      9.59     9.90      8.69
  4    2  1013.62  1013.48  1012.24  1012.60      7.40     5.98      5.48
  5    4  1045.07  1045.29  1044.10  1044.42      4.93     4.06      3.72
  6    3  1428.74  1429.32  1429.28  1429.52      5.96     4.94      4.89
  7    1  1045.74  1045.66  1049.90  1049.17     10.96     9.20      7.39
  8    3  1216.03  1216.09  1215.84  1216.03      6.10     4.63      4.36
  9    2  1260.79  1260.93  1259.31  1259.67      6.85     5.70      5.21
 10    3  1138.43  1138.58  1135.89  1136.59      5.56     5.12      4.59
 11    2  1231.43  1231.42  1234.22  1233.57      6.81     6.00      5.47
 12    1  1191.05  1191.41  1187.61  1188.29      9.89     8.86      7.82

Here, we see that in all the strata, the EB estimator performs the best. Except
in stratum 3, the regression estimator is second best while the sample mean has
the worst performance. This shows that when the error in observing the true
covariate becomes moderate, the EB estimator starts doing better than the usual
regression estimator. The sample mean is usually the worst.


Table 2-3: The population means (TM), the sample means (SM), the regression
estimates (R), the empirical Bayes estimates (EB), and the corresponding RMSE's
for the 12 strata when $\sigma_\eta^2 = 4$

  i  n_i       TM       SM        R       EB  RMSE(SM)  RMSE(R)  RMSE(EB)
  1    1   377.50   377.63   371.60   372.21      8.58    11.53     10.51
  2    5   964.83   964.78   965.38   965.09      4.71     4.66      4.25
  3    1   914.43   914.21   923.28   921.12     11.76    11.94     10.20
  4    2  1011.18  1011.42  1012.89  1012.28      6.69     7.56      6.40
  5    4  1048.69  1048.59  1044.94  1046.24      5.06     6.37      5.34
  6    3  1440.43  1440.77  1434.34  1436.83      5.58     8.67      6.73
  7    1  1046.70  1047.06  1050.59  1049.91      9.59     8.88      7.85
  8    3  1216.53  1216.53  1219.59  1218.63      5.60     6.61      5.59
  9    2  1260.78  1260.83  1263.23  1262.52      7.84     7.92      6.98
 10    3  1136.36  1136.58  1137.75  1137.47      5.38     6.04      5.23
 11    2  1235.57  1235.73  1238.65  1237.88      6.56     8.48      7.09
 12    1  1193.68  1193.55  1191.65  1192.20     10.26    10.60      9.23

We  notice  that  in  8 of  the  strata,  the  EB  estimator  has  the  lowest  RMSE. 
Also,  in  these  strata,  the  sample  mean  usually  does  better  than  the  regression 
estimator.  In  the  rest  of  the  strata,  the  sample  mean  is  the  best,  followed  by  the 
EB,  while  the  regression  estimator  has  the  worst  performance. 


Table 2-4: The population mean, the 3 estimators and their RMSE's for the 12
counties when $b_1 = 5$ and $\sigma_\eta^2 = 4.5$

  i  n_i       TM       SM        R       EB  RMSE(SM)  RMSE(R)  RMSE(EB)
  1    1   378.05   378.64   380.06   379.15      8.71    10.62      9.50
  2    5   968.77   968.65   968.52   968.49      4.74     4.69      4.20
  3    1   920.85   921.15   927.81   926.06      9.75    13.11     11.15
  4    2  1013.37  1013.49  1013.78  1013.50      7.28     6.74      5.88
  5    4  1047.52  1047.47  1047.46  1047.44      5.32     5.10      4.62
  6    3  1424.31  1424.37  1428.96  1427.75      6.17     7.46      6.56
  7    1  1045.23  1045.36  1051.73  1050.56     10.12    12.03     10.20
  8    3  1221.85  1221.67  1216.95  1218.65      5.83     8.08      6.54
  9    2  1264.03  1263.65  1258.45  1260.12      6.48     9.78      7.81
 10    3  1138.16  1138.32  1136.89  1137.40      5.01     6.23      5.23
 11    2  1234.64  1234.50  1236.17  1235.62      6.31     7.39      6.07
 12    1  1191.43  1192.16  1189.53  1190.20      9.59    10.74      9.32

Here, we see that in about half of the strata (five of the twelve), the EB estimator
is the best and in the remaining strata, the sample mean is the best. In either case,
the regression estimator usually does the worst.

This indicates that when $\sigma_\eta^2$ is high, the regression estimator is a bad choice.
The EB estimator, which is the compromise estimator, does well all through, being
the best in the middle of the interval. The sample mean starts off being a bad
choice when there is little error in measuring the covariates, but improves when this
error increases.


Figure 2-1: R = sum(mse.y) / sum(mse.our.estimate) against different values of sigma2.eta with b1 = 5

Figure 2-1 further validates these findings. We define
$\rho = \sum_{i=1}^m (\bar{y}_i - \gamma_i)^2 / \sum_{i=1}^m (\hat\gamma_i^{EB} - \gamma_i)^2$, and observe the behavior of this ratio
over different values of $\sigma_\eta^2$.

We see that when $\sigma_\eta^2 < 4.5$, then $\rho > 1$, which means that the EB estimator is
doing better than the sample mean in the overall picture, over all the strata. When
$\sigma_\eta^2 > 4.5$, we see that $\rho$ keeps falling below 1, which means that the sample mean
starts doing better.
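As a small worked example of this ratio, it can be computed from the per-stratum RMSE's of the sample mean and of the EB estimator reported in Table 2-2 for $\sigma_\eta^2 = 2.5$. The sketch below simply squares the tabulated RMSE's and sums them over strata; the helper name `risk_ratio` is hypothetical, and using the rounded tabulated values is an approximation to the quantity plotted in the figure.

```python
# RMSE's of the sample mean and the EB estimator, copied from Table 2-2
rmse_sm = [10.15, 4.68, 9.59, 7.40, 4.93, 5.96,
           10.96, 6.10, 6.85, 5.56, 6.81, 9.89]
rmse_eb = [8.67, 4.17, 8.69, 5.48, 3.72, 4.89,
           7.39, 4.36, 5.21, 4.59, 5.47, 7.82]


def risk_ratio(rmse_reference, rmse_candidate):
    """Ratio of summed MSE's: values above 1 favour the candidate estimator."""
    num = sum(r ** 2 for r in rmse_reference)
    den = sum(r ** 2 for r in rmse_candidate)
    return num / den


rho = risk_ratio(rmse_sm, rmse_eb)
```

For these tabulated values the ratio comes out at roughly 1.58, in line with the statement that the EB estimator dominates the sample mean overall when the measurement error variance is moderate.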


Figure 2-2: The effect of increasing sample sizes on R, keeping b1 = 5 and sigma2.eta = 5

However, in Figure 2-2, we see that if the sample size increases, and thus the
sampling fraction, defined as $n_i/N_i$, increases, then $\rho$ picks up again and the EB
estimator starts doing better than the sample mean.

Thus, we conclude that the EB estimator performs very well up until $\sigma_\eta^2$ gets
close to $b_1$. That is when the sample mean starts doing better. However, if at this
point the sample size is increased, the EB estimator again takes over.


CHAPTER  3 

EMPIRICAL  AND  HIERARCHICAL  BAYES  ESTIMATION  IN  FINITE 

POPULATION  SAMPLING 

UNDER  STRUCTURAL  MEASUREMENT  ERROR  MODELS 

3.1  Introduction 

In  the  previous  chapter,  we  considered  EB  estimation  based  on  functional 
measurement  error  models.  In  this  chapter,  we  develop  EB  and  HB  procedures  for 
simultaneous  estimation  of  finite  population  strata  means  for  the  structural  model, 
i.e.  when  the  covariates,  say  x,  are  measured  with  error. 

EB estimators (or more appropriately predictors) for strata means are
developed in Section 3.2. This section also contains estimation of the superpopulation
parameters taking into account the possibility that the covariates are measured
with error. Section 3.3 proves the "asymptotic optimality" of EB estimators in the
sense of Robbins (1956). The HB estimators are developed in Section 3.4. Also, in
this section, we have established the propriety of the posteriors, and have discussed
the Markov chain Monte Carlo implementation of the proposed hierarchical Bayes
procedure. A simulation study is conducted in Section 3.5 to compare the
performances of the EB and HB estimators. Analysis of real-life data is undertaken
in Section 3.6 to compare the different methods that are proposed. The proofs of
certain technical results are deferred to the Appendix.

With  one  stratum,  Bolfarine  and  Zacks  (1992)  considered  a measurement  error 
model  similar  to  ours.  However,  their  main  objective  was  Bayesian  estimation  of 
the  superpopulation  parameters.  Also,  they  assumed  the  variance  components 
to  be  known,  and  provided  a normal  approximation  of  the  posterior  distribution. 




Our  objective  is  to  estimate  instead  the  strata  means.  More  importantly,  the  EB 
procedure,  as  introduced  here,  estimates  all  the  hyperparameters  including  all  the 
variance  components,  and  does  not  require  any  approximation  of  the  posterior. 

The  HB  procedure  also  does  not  rely  on  any  normal  approximation.  Bolfarine  and 
Sandoval  (1990)  provided  Bayesian  estimates  of  the  finite  population  mean  when 
the  superpopulation  mean  is  measured  without  error,  and  a certain  variance  ratio  is 
known.  Also,  they  used  a noninformative  prior  which  is  different  from  ours. 

3.2  EB  Estimators 


Suppose there are $m$ strata labeled $1, \ldots, m$ and let $N_i$ denote the known
population size for the $i$th stratum. We denote by $y_{ij}$ the response of the $j$th
unit in the $i$th stratum $(j = 1, \ldots, N_i;\ i = 1, \ldots, m)$. A sample of size $n_i$ is
drawn from the $i$th stratum. Without loss of generality, we denote the sampled
units by $1, 2, \ldots, n_i$ $(i = 1, \ldots, m)$. Throughout, we will use the notations

= {y^ni+^,■■■,yiNiV,  yj  = = 


'1+1  ’ 

, ■ ■ ■ , ym'^'  )>  ■ • ■ > and  The 


(1)^ 


basic problem in finite population sampling is inference about $y^{(2)}$ conditional on
$y^{(1)}$. More specifically, in this chapter, we will be interested in the estimation (more
appropriately prediction) of finite population means $\gamma_i = N_i^{-1} \sum_{j=1}^{N_i} y_{ij}$
$(i = 1, \ldots, m)$ given the data.

We assume the superpopulation model

$$y_{ij} = b_0 + b_1 x_i + u_i + e_{ij} \quad (j = 1, \ldots, N_i;\ i = 1, \ldots, m), \tag{3.1}$$

$$X_{ij} = x_i + \eta_{ij} \quad (j = 1, \ldots, N_i;\ i = 1, \ldots, m). \tag{3.2}$$

It is assumed that the $x_i$, $u_i$, $e_{ij}$ and $\eta_{ij}$ are mutually independent with
$x_i \stackrel{iid}{\sim} N(\mu_x, \sigma_x^2)$, $u_i \stackrel{iid}{\sim} N(0, \sigma_u^2)$, $e_{ij} \stackrel{iid}{\sim} N(0, \sigma_e^2)$ and $\eta_{ij} \stackrel{iid}{\sim} N(0, \sigma_\eta^2)$. The available
data consist of $(y_{ij}, X_{ij})$ $(j = 1, \ldots, n_i;\ i = 1, \ldots, m)$. Also, we write
$\phi = (b_0, b_1, \mu_x, \sigma_x^2, \sigma_u^2, \sigma_e^2, \sigma_\eta^2)^T$.

Clearly (3.1) is a random effects model. An alternative way of expressing the
same is

$$y_{ij} = \theta_i + e_{ij}, \qquad \theta_i = b_0 + b_1 x_i + u_i \quad (j = 1, \ldots, N_i;\ i = 1, \ldots, m).$$

In this way, it is possible to identify (3.1) as a Bayesian model. Throughout this
article, we will use the Bayesian terminology, although the EB estimators to be
developed in this section can also be viewed as empirical best linear unbiased
predictors.

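Now, writing $1_{n_i}$ as the $n_i$ dimensional column vector of 1's, $J_{n_i} = 1_{n_i} 1_{n_i}^T$
and $I_{n_i}$ as the identity matrix of order $n_i$, $(y_i^{(1)T}, y_i^{(2)T})^T \mid \phi$ follows a multivariate normal
distribution with parameters

$$\begin{pmatrix} (b_0 + b_1 \mu_x) 1_{n_i} \\ (b_0 + b_1 \mu_x) 1_{N_i - n_i} \end{pmatrix}, \qquad
\begin{pmatrix} \sigma_e^2 I_{n_i} + (\sigma_u^2 + b_1^2 \sigma_x^2) J_{n_i} & (\sigma_u^2 + b_1^2 \sigma_x^2) 1_{n_i} 1_{N_i - n_i}^T \\
(\sigma_u^2 + b_1^2 \sigma_x^2) 1_{N_i - n_i} 1_{n_i}^T & \sigma_e^2 I_{N_i - n_i} + (\sigma_u^2 + b_1^2 \sigma_x^2) J_{N_i - n_i} \end{pmatrix}.$$

Then the Bayes predictor of $y_i^{(2)}$ given $y_i^{(1)}$ and $\phi$ reduces after some simplification
to

$$\begin{aligned}
E(y_i^{(2)} \mid y_i^{(1)}, \phi) &= (b_0 + b_1 \mu_x) 1_{N_i - n_i} + (\sigma_u^2 + b_1^2 \sigma_x^2) 1_{N_i - n_i} 1_{n_i}^T \big[\sigma_e^2 I_{n_i} + (\sigma_u^2 + b_1^2 \sigma_x^2) J_{n_i}\big]^{-1} \big[y_i^{(1)} - (b_0 + b_1 \mu_x) 1_{n_i}\big] \\
&= \big[(1 - B_i)\bar{y}_i^{(1)} + B_i (b_0 + b_1 \mu_x)\big] 1_{N_i - n_i},
\end{aligned} \tag{3.4}$$

where $\bar{y}_i^{(1)} = n_i^{-1} \sum_{j=1}^{n_i} y_{ij}$ and $B_i = \sigma_e^2/\{\sigma_e^2 + n_i (\sigma_u^2 + b_1^2 \sigma_x^2)\}$ $(i = 1, \ldots, m)$. The above
predictor can be viewed also as the best linear unbiased predictor of $y_i^{(2)}$ given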


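This leads to the Bayes predictor of $\gamma_i$ as

$$\hat\gamma_i^{B} = E[\gamma_i \mid y_i^{(1)}] = (1 - f_i B_i)\bar{y}_i^{(1)} + f_i B_i (b_0 + b_1 \mu_x), \tag{3.5}$$

where $f_i = (N_i - n_i)/N_i$ is the finite population correction fraction. For simplicity,
henceforth, we will denote $\bar{y}_i^{(1)}$ by $\bar{y}_i$.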
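The reduction of the matrix form of the conditional mean to a scalar shrinkage form can be verified numerically. The sketch below compares the two for one stratum; it assumes the compound-symmetric covariance structure of the structural model with common covariance `tau2` standing for $\sigma_u^2 + b_1^2 \sigma_x^2$ and prior mean `mu` standing for $b_0 + b_1 \mu_x$. The function names and the numerical inputs are hypothetical.

```python
import numpy as np


def conditional_mean_matrix(y1, N_i, mu, sigma2_e, tau2):
    """Conditional mean of the unsampled units via the normal-theory formula."""
    y1 = np.asarray(y1, dtype=float)
    n_i = y1.size
    ones_n = np.ones(n_i)
    ones_r = np.ones(N_i - n_i)
    V11 = sigma2_e * np.eye(n_i) + tau2 * np.outer(ones_n, ones_n)
    V21 = tau2 * np.outer(ones_r, ones_n)
    # mu + V21 V11^{-1} (y1 - mu), using a linear solve instead of an inverse
    return mu * ones_r + V21 @ np.linalg.solve(V11, y1 - mu * ones_n)


def conditional_mean_shrinkage(y1, N_i, mu, sigma2_e, tau2):
    """Same quantity written as a shrinkage of the sample mean towards mu."""
    y1 = np.asarray(y1, dtype=float)
    n_i = y1.size
    B = sigma2_e / (sigma2_e + n_i * tau2)
    return ((1 - B) * y1.mean() + B * mu) * np.ones(N_i - n_i)


# Hypothetical inputs for a single stratum
via_matrix = conditional_mean_matrix([1.0, 2.0, 6.0], 7, 2.0, 4.0, 3.0)
via_shrink = conditional_mean_shrinkage([1.0, 2.0, 6.0], 7, 2.0, 4.0, 3.0)
```

Both routes give the same constant prediction for every unsampled unit, which is why the predictor of the stratum mean only needs the sample mean, the shrinkage factor and the finite population correction.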

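In an EB scenario, the components of $\phi$ are unknown and need to be estimated
from the data. We write $\bar{X}_i = n_i^{-1} \sum_{j=1}^{n_i} X_{ij}$, $SSW_x = \sum_{i=1}^m \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)^2$,
$SSW_y = \sum_{i=1}^m \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2$. Then the minimal sufficient statistic is
$(\bar{y}_1, \ldots, \bar{y}_m, SSW_y, \bar{X}_1, \ldots, \bar{X}_m, SSW_x)$. Also, $\bar{y}_i \stackrel{ind}{\sim} N(b_0 + b_1 \mu_x, \sigma_e^2/n_i + \sigma_u^2 + b_1^2 \sigma_x^2)$,
$\bar{X}_i \stackrel{ind}{\sim} N(\mu_x, \sigma_\eta^2/n_i + \sigma_x^2)$, $SSW_y \sim \sigma_e^2 \chi^2_{n_T - m}$ and $SSW_x \sim \sigma_\eta^2 \chi^2_{n_T - m}$,
where $n_T = \sum_{i=1}^m n_i$ is assumed to be bigger than $m$. A simple-minded initial estimator
of $b_1$ is given by

$$\tilde{b}_1 = \sum_{i=1}^m n_i (\bar{X}_i - \bar{X})(\bar{y}_i - \bar{y}) \Big/ \sum_{i=1}^m n_i (\bar{X}_i - \bar{X})^2.$$

The following theorem shows that $\tilde{b}_1$ is typically an inconsistent estimator of
$b_1$. Let $g_m = n_T - \sum_{i=1}^m n_i^2/n_T$.

Theorem 3.2.1 Assume (i) $\max_{1 \le i \le m} n_i \le K < \infty$ and (ii) $g_m/(m - 1) \to c$ as
$m \to \infty$. Then $E(\tilde{b}_1) \to b_1 c \sigma_x^2/(\sigma_\eta^2 + c \sigma_x^2)$.

Remark 3.1 : Assumption (i) is very natural in a small area context. Assumption
(ii) also holds when the $n_i$ do not differ significantly. In particular, when
$n_1 = \cdots = n_m = n$, then $g_m/(m - 1) = n$.

The proof of the theorem is deferred to the Appendix. The theorem says that
the regression coefficient $\tilde{b}_1$ converges in probability to a fractional multiple of $b_1$.
Thus, the regression coefficient is attenuated by measurement error. The theorem
also implies that a consistent estimator of $b_1$ is obtained when one multiplies $\tilde{b}_1$ by
a consistent estimator of $(\sigma_\eta^2 + c \sigma_x^2)/(c \sigma_x^2)$.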




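To this end, we observe that since $SSW_x \sim \sigma_\eta^2\chi^2_{n_T-m}$, writing $MSW_x = SSW_x/(n_T - m)$, $E(MSW_x) = \sigma_\eta^2$ and $V(MSW_x) = 2\sigma_\eta^4/(n_T - m)$. Hence, if $n_T - m \to \infty$ as $m \to \infty$, $MSW_x \overset{P}{\to} \sigma_\eta^2$. Next, writing $SSB_x = \sum_{i=1}^m n_i(\bar X_i - \bar X)^2$, one can write $SSB_x = Z_m^T(I_m - u_mu_m^T)Z_m$, where $Z_m = (\sqrt{n_1}\bar X_1,\dots,\sqrt{n_m}\bar X_m)^T$, $u_m = (\sqrt{n_1/n_T},\dots,\sqrt{n_m/n_T})^T$ and $I_m$ is the identity matrix of order $m$. Now, conditional on $x = (x_1,\dots,x_m)^T$, $Z_m$ has the mean vector $(\sqrt{n_1}x_1,\dots,\sqrt{n_m}x_m)^T$ and the variance-covariance matrix $\sigma_\eta^2I_m$. By the symmetry and idempotency of $I_m - u_mu_m^T$, $SSB_x \mid x \sim \sigma_\eta^2\chi^2_{m-1}(\lambda)$, a noncentral chi-square with noncentrality parameter

\[
\lambda = \sigma_\eta^{-2}(\sqrt{n_1}x_1,\dots,\sqrt{n_m}x_m)(I_m - u_mu_m^T)(\sqrt{n_1}x_1,\dots,\sqrt{n_m}x_m)^T = \sigma_\eta^{-2}\sum_{i=1}^m n_i(x_i - \bar x)^2,
\]

where $\bar x = n_T^{-1}\sum_{i=1}^m n_ix_i$.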

We now state the following lemma, the proof of which also is deferred to the Appendix.

Lemma 3.2.1 Assume conditions (i) and (ii) of Theorem 3.2.1. Then $E[\sum_{i=1}^m n_i(x_i - \bar x)^2/(m-1)] \to c\sigma_x^2$ and $V[\sum_{i=1}^m n_i(x_i - \bar x)^2/(m-1)] \to 0$.

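In view of Lemma 3.2.1, writing $MSB_x = SSB_x/(m-1)$,

\[
E(MSB_x) = E\,E(MSB_x \mid x) = \sigma_\eta^2\,E\Big[m - 1 + \sum_{i=1}^m n_i(x_i - \bar x)^2/\sigma_\eta^2\Big]\Big/(m-1) \to \sigma_\eta^2 + c\sigma_x^2 \quad\text{as } m \to \infty,
\]

while

\[
\begin{aligned}
V(MSB_x) &= E[V(MSB_x \mid x)] + V[E(MSB_x \mid x)] \\
&= \sigma_\eta^4\,E\Big[2(m-1) + 4\sum_{i=1}^m n_i(x_i - \bar x)^2/\sigma_\eta^2\Big]\Big/(m-1)^2 + \sigma_\eta^4\,V\Big[1 + \sum_{i=1}^m n_i(x_i - \bar x)^2/\{\sigma_\eta^2(m-1)\}\Big] \to 0 + 0 = 0
\end{aligned}
\]

as $m \to \infty$. Thus, under the assumptions of Theorem 3.2.1, if $n_T - m \to \infty$ as $m \to \infty$, $(\sigma_\eta^2 + c\sigma_x^2)/(c\sigma_x^2)$ is consistently estimated by $MSB_x/(MSB_x - MSW_x) = (1 - MSW_x/MSB_x)^{-1}$. Thus, $b_1$ and $b_0$ are consistently estimated by

\[
\hat b_1 = (1 - MSW_x/MSB_x)^{-1}\,\tilde b_1, \qquad \hat b_0 = \bar y - \hat b_1\bar X. \tag{3.6}
\]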
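
The attenuation correction in (3.6) can be sketched numerically as follows. This is a minimal illustration, not the dissertation's own code: it assumes that the initial estimator $\tilde b_1$ and the grand means are weighted by the $n_i$ (consistent with the definitions of $SSB_x$ and $MSB_y$), and the function name `corrected_slope` is hypothetical. The correction is only meaningful when $MSB_x > MSW_x$.

```python
import numpy as np


def corrected_slope(X, y):
    """Return (b0_hat, b1_hat, b1_tilde) from per-area samples X[i], y[i].

    X and y are lists of 1-D arrays of equal length n_i for each area i.
    Assumes n_T > m and MSB_x > MSW_x.
    """
    n = np.array([len(v) for v in X], dtype=float)
    m, n_T = len(X), n.sum()
    Xbar_i = np.array([np.mean(v) for v in X])
    ybar_i = np.array([np.mean(v) for v in y])
    Xbar = np.sum(n * Xbar_i) / n_T          # weighted grand means (assumption)
    ybar = np.sum(n * ybar_i) / n_T
    ssw_x = sum(np.sum((np.asarray(v) - c) ** 2) for v, c in zip(X, Xbar_i))
    msw_x = ssw_x / (n_T - m)
    ssb_x = np.sum(n * (Xbar_i - Xbar) ** 2)
    msb_x = ssb_x / (m - 1)
    # naive (attenuated) slope, then the correction of (3.6)
    b1_tilde = np.sum(n * (Xbar_i - Xbar) * (ybar_i - ybar)) / ssb_x
    b1_hat = b1_tilde / (1.0 - msw_x / msb_x)
    b0_hat = ybar - b1_hat * Xbar
    return b0_hat, b1_hat, b1_tilde
```

With equal sample sizes $n_i = n$ the attenuation factor is $n\sigma_x^2/(\sigma_\eta^2 + n\sigma_x^2)$, so on simulated data with many areas the naive slope should settle near that fraction of $b_1$ while the corrected slope should be close to $b_1$ itself.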


Next, we need consistent estimators of the $B_i$, where $B_i$ is defined after (3.4). First, writing $MSW_y = SSW_y/(n_T - m)$, $E(MSW_y) = \sigma_e^2$ and $V(MSW_y) = 2\sigma_e^4/(n_T - m) \to 0$ as $m \to \infty$ when $n_T - m \to \infty$ as $m \to \infty$. Thus, $\sigma_e^2$ is consistently estimated by $MSW_y$ if $n_T - m \to \infty$ as $m \to \infty$. Next, defining $MSB_y = \sum_{i=1}^m n_i(\bar y_i - \bar y)^2/(m-1)$, we calculate

\[
\begin{aligned}
E(MSB_y) &= E[E(MSB_y \mid x)] \\
&= E\Big[\sigma_e^2 + b_1^2\sum_{i=1}^m n_i(x_i - \bar x)^2/(m-1) + \sigma_u^2g_m/(m-1)\Big] \\
&\to \sigma_e^2 + c(b_1^2\sigma_x^2 + \sigma_u^2),
\end{aligned}
\tag{3.7}
\]

by Lemma 3.2.1. We will prove next the following theorem.

Theorem 3.2.2 Assume conditions (i) and (ii) of Theorem 3.2.1 and (iii) $n_T - m \to \infty$ as $m \to \infty$. Then $V(MSB_y) \to 0$ as $m \to \infty$.

The proof of this theorem is deferred to the Appendix. In view of this theorem and (3.7), we estimate $\sigma_u^2 + b_1^2\sigma_x^2 = \xi$ (say) consistently by $\hat\xi_m = \max[0, (MSB_y - MSW_y)(m-1)/g_m]$. The introduction of 0 is simply to overcome the fact that $MSB_y - MSW_y$ can assume negative values with positive probability. Now $B_i$ is estimated consistently by

\[
\hat B_i = MSW_y/(MSW_y + n_i\hat\xi_m), \quad (i = 1,\dots,m). \tag{3.8}
\]

The EB predictor of $\gamma_i$ is thus given by

\[
\begin{aligned}
\hat\gamma_i^{EB} &= (1 - f_i\hat B_i)\bar y_i + f_i\hat B_i(\hat b_0 + \hat b_1\bar X) \\
&= (1 - f_i\hat B_i)\bar y_i + f_i\hat B_i\bar y, \quad i = 1,\dots,m.
\end{aligned}
\tag{3.9}
\]

Remark 3.2 : It may be noted that with the present method of moments estimators of the superpopulation and prior parameters, the EB estimator of $\gamma$ is the same as in the situation when the covariates are measured without error. One may view this as a deficiency of the proposed EB estimator, but it seems possible to overcome this deficiency by considering the MLE's of $b_0$, $b_1$ and the variance components. However, then one may have to sacrifice direct analytic evaluations, and rely instead only on numerical findings.
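
The method-of-moments steps leading to (3.8) and (3.9) can be sketched as below. This is only a minimal sketch under stated assumptions: the grand mean $\bar y$ is taken to be weighted by the $n_i$, $n_T > m$ and $MSW_y > 0$ are assumed, and the function name `eb_predictor` is hypothetical.

```python
import numpy as np


def eb_predictor(y, N):
    """EB predictor (3.9) of the stratum means.

    y : list of 1-D arrays, the sampled responses in each stratum
    N : population sizes N_i of the strata
    """
    n = np.array([len(v) for v in y], dtype=float)
    N = np.asarray(N, dtype=float)
    m, n_T = len(y), n.sum()
    ybar_i = np.array([np.mean(v) for v in y])
    ybar = np.sum(n * ybar_i) / n_T                      # weighted grand mean (assumption)
    ssw_y = sum(np.sum((np.asarray(v) - c) ** 2) for v, c in zip(y, ybar_i))
    msw_y = ssw_y / (n_T - m)                            # estimates sigma_e^2
    msb_y = np.sum(n * (ybar_i - ybar) ** 2) / (m - 1)
    g_m = n_T - np.sum(n ** 2) / n_T
    xi_hat = max(0.0, (msb_y - msw_y) * (m - 1) / g_m)   # truncated at zero
    B_hat = msw_y / (msw_y + n * xi_hat)                 # equation (3.8)
    f = (N - n) / N                                      # finite population corrections
    return (1.0 - f * B_hat) * ybar_i + f * B_hat * ybar
```

Because $0 \le f_i\hat B_i \le 1$, each EB prediction is a convex combination of the stratum sample mean and the overall mean, which is the behavior Remark 3.2 describes.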




3.3  Asymptotic  Optimality  of  the  EB  Predictor 


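We first compute the Bayes risk of the EB predictor $\hat\gamma^{EB} = (\hat\gamma_1^{EB},\dots,\hat\gamma_m^{EB})^T$ of $\gamma = (\gamma_1,\dots,\gamma_m)^T$, the vector of population strata means, i.e. we compute $m^{-1}\sum_{i=1}^m E(\hat\gamma_i^{EB} - \gamma_i)^2$. For this, we begin with the identity

\[
m^{-1}\sum_{i=1}^m E(\hat\gamma_i^{EB} - \gamma_i)^2 = m^{-1}\sum_{i=1}^m E(\hat\gamma_i^{B} - \gamma_i)^2 + m^{-1}\sum_{i=1}^m E(\hat\gamma_i^{EB} - \hat\gamma_i^{B})^2, \tag{3.10}
\]

which holds since $E[(\hat\gamma_i^{EB} - \hat\gamma_i^{B})(\hat\gamma_i^{B} - \gamma_i)] = E[(\hat\gamma_i^{EB} - \hat\gamma_i^{B})E(\hat\gamma_i^{B} - \gamma_i \mid y)] = 0$.

We first note that

\[
m^{-1}\sum_{i=1}^m E(\hat\gamma_i^{B} - \gamma_i)^2 = m^{-1}\sum_{i=1}^m E\big[V(\gamma_i \mid y_i^{(1)})\big] = m^{-1}\sum_{i=1}^m N_i^{-2}\,1_{N_i-n_i}^T V\big(y_i^{(2)} \mid y_i^{(1)}\big)1_{N_i-n_i}. \tag{3.11}
\]

Straightforward but tedious calculations yield

\[
V\big(y_i^{(2)} \mid y_i^{(1)}\big) = \sigma_e^2I_{N_i-n_i} + \frac{\sigma_e^2(\sigma_u^2 + b_1^2\sigma_x^2)}{\sigma_e^2 + n_i(\sigma_u^2 + b_1^2\sigma_x^2)}\,J_{N_i-n_i}. \tag{3.12}
\]

Hence, from (3.10) and (3.11),

\[
m^{-1}\sum_{i=1}^m E(\hat\gamma_i^{EB} - \gamma_i)^2 = m^{-1}\sum_{i=1}^m N_i^{-2}\,1_{N_i-n_i}^T V\big(y_i^{(2)} \mid y_i^{(1)}\big)1_{N_i-n_i} + m^{-1}\sum_{i=1}^m E(\hat\gamma_i^{EB} - \hat\gamma_i^{B})^2.
\]

This leads to

\[
\begin{aligned}
m^{-1}\sum_{i=1}^m E(\hat\gamma_i^{EB} - \gamma_i)^2
&= m^{-1}\sum_{i=1}^m N_i^{-2}\Big[(N_i - n_i)\sigma_e^2 + (N_i - n_i)^2\frac{\sigma_e^2(\sigma_u^2 + b_1^2\sigma_x^2)}{\sigma_e^2 + n_i(\sigma_u^2 + b_1^2\sigma_x^2)}\Big] + m^{-1}\sum_{i=1}^m E(\hat\gamma_i^{EB} - \hat\gamma_i^{B})^2 \qquad (3.13) \\
&= m^{-1}\sigma_e^2\sum_{i=1}^m f_i\Big[N_i^{-1} + \frac{f_i(\sigma_u^2 + b_1^2\sigma_x^2)}{\sigma_e^2 + n_i(\sigma_u^2 + b_1^2\sigma_x^2)}\Big] + m^{-1}\sum_{i=1}^m E(\hat\gamma_i^{EB} - \hat\gamma_i^{B})^2. \qquad (3.14)
\end{aligned}
\]

We will next prove the following theorem which establishes the asymptotic optimality (cf. Robbins, 1956) of EB estimators under certain conditions.

Theorem 3.3.1 Assume (i) $\min_{1\le i\le m} n_i \ge 1$, (ii) $\max_{1\le i\le m} n_i \le K < \infty$, (iii) $n_T - m \to \infty$ as $m \to \infty$. Then $m^{-1}\sum_{i=1}^m E(\hat\gamma_i^{EB} - \hat\gamma_i^{B})^2 \to 0$ as $m \to \infty$.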




The  proof  of  this  theorem  is  deferred  to  the  Appendix. 

3.4  HB  Predictors 


Next we consider a hierarchical Bayesian framework to predict the population strata means $\gamma_i$ $(i = 1,\dots,m)$. To this end, we begin with the following model :

Stage 1. $y_{ij} = \theta_i + e_{ij}$ $(j = 1,\dots,n_i;\ i = 1,\dots,m)$, where the $e_{ij}$ are iid $N(0,\sigma_e^2)$.

Stage 2. $\theta_i = b_0 + b_1x_i + u_i$ $(i = 1,\dots,m)$, where the $u_i$ are iid $N(0,\sigma_u^2)$;
$X_{ij} = x_i + \eta_{ij}$ $(j = 1,\dots,n_i;\ i = 1,\dots,m)$, where the $\eta_{ij}$ are iid $N(0,\sigma_\eta^2)$.

Stage 3. The $x_i$ are iid $N(\mu_x,\sigma_x^2)$.

Stage 4. $b_0$, $b_1$, $\mu_x$, $\sigma_e^2$, $\sigma_u^2$, $\sigma_\eta^2$ and $\sigma_x^2$ are mutually independent with $b_0$, $b_1$ and $\mu_x$ iid uniform$(-\infty,\infty)$; $\sigma_e^2 \sim IG(\tfrac12a_e,\tfrac12b_e)$, $\sigma_u^2 \sim IG(\tfrac12a_u,\tfrac12b_u)$, $\sigma_\eta^2 \sim IG(\tfrac12a_\eta,\tfrac12b_\eta)$, $\sigma_x^2 \sim IG(\tfrac12a_x,\tfrac12b_x)$, where $IG(\alpha,\beta)$ denotes an inverse gamma distribution with pdf $f_{\alpha,\beta}(z) \propto z^{-\beta-1}\exp(-\alpha/z)$.

The first thing is to check the propriety of the posterior under the given prior. The following theorem is proved. We will write $b = (b_0,b_1)^T$; $z_i^T = (1,x_i)$.

Theorem 3.4.1 Assume $a_e, a_u, a_\eta, a_x$ all positive. Also, let $b_e + n_T - m > 0$, $b_u + m - p > 0$ and $b_x + m - 1 > 0$. Then the joint posterior is proper.

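Proof The joint posterior is given by

\[
\begin{aligned}
\pi(\theta,x,b,\mu_x,\sigma_e^2,\sigma_u^2,\sigma_\eta^2,\sigma_x^2 \mid y,X) \propto{}& (\sigma_e^2)^{-n_T/2}\exp\Big[-\frac{1}{2\sigma_e^2}\Big(\sum_{i=1}^m n_i(\bar y_i - \theta_i)^2 + SSW_y\Big)\Big] \\
&\times (\sigma_u^2)^{-m/2}\exp\Big[-\frac{1}{2\sigma_u^2}\sum_{i=1}^m(\theta_i - z_i^Tb)^2\Big] \\
&\times (\sigma_\eta^2)^{-n_T/2}\exp\Big[-\frac{1}{2\sigma_\eta^2}\Big(\sum_{i=1}^m n_i(\bar X_i - x_i)^2 + SSW_x\Big)\Big] \\
&\times (\sigma_x^2)^{-m/2}\exp\Big[-\frac{1}{2\sigma_x^2}\sum_{i=1}^m(x_i - \mu_x)^2\Big] \\
&\times (\sigma_e^2)^{-\frac{b_e}{2}-1}\exp\Big(-\frac{a_e}{2\sigma_e^2}\Big)(\sigma_u^2)^{-\frac{b_u}{2}-1}\exp\Big(-\frac{a_u}{2\sigma_u^2}\Big) \\
&\times (\sigma_\eta^2)^{-\frac{b_\eta}{2}-1}\exp\Big(-\frac{a_\eta}{2\sigma_\eta^2}\Big)(\sigma_x^2)^{-\frac{b_x}{2}-1}\exp\Big(-\frac{a_x}{2\sigma_x^2}\Big).
\end{aligned}
\]

First integrating out with respect to $\mu_x$ and noting that $\exp[-\frac{1}{2\sigma_x^2}\sum_{i=1}^m(x_i - \bar x)^2] \le 1$, where here $\bar x = m^{-1}\sum_{i=1}^m x_i$,

\[
\begin{aligned}
\pi(\theta,x,b,\sigma_e^2,\sigma_u^2,\sigma_\eta^2,\sigma_x^2 \mid y,X) \le{}& K\exp\Big[-\frac{1}{2\sigma_e^2}\Big(\sum_{i=1}^m n_i(\bar y_i - \theta_i)^2 + SSW_y\Big)\Big]\exp\Big[-\frac{1}{2\sigma_u^2}\sum_{i=1}^m(\theta_i - z_i^Tb)^2\Big] \\
&\times \exp\Big[-\frac{1}{2\sigma_\eta^2}\Big(\sum_{i=1}^m n_i(\bar X_i - x_i)^2 + SSW_x\Big)\Big] \\
&\times (\sigma_e^2)^{-\frac{b_e+n_T}{2}-1}\exp\Big(-\frac{a_e}{2\sigma_e^2}\Big)(\sigma_u^2)^{-\frac{b_u+m}{2}-1}\exp\Big(-\frac{a_u}{2\sigma_u^2}\Big) \\
&\times (\sigma_\eta^2)^{-\frac{b_\eta+n_T}{2}-1}\exp\Big(-\frac{a_\eta}{2\sigma_\eta^2}\Big)(\sigma_x^2)^{-\frac{b_x+m-1}{2}-1}\exp\Big(-\frac{a_x}{2\sigma_x^2}\Big),
\end{aligned}
\]

where in the above and in what follows, $K (> 0)$ is a generic constant.

Next, writing $X_* = (1_m, x)$,

\[
\begin{aligned}
\sum_{i=1}^m(\theta_i - z_i^Tb)^2 &= b^T(X_*^TX_*)b - 2b^TX_*^T\theta + \theta^T\theta \\
&= [b - (X_*^TX_*)^{-1}X_*^T\theta]^T(X_*^TX_*)[b - (X_*^TX_*)^{-1}X_*^T\theta] + \theta^T(I - P_{X_*})\theta,
\end{aligned}
\]

where $P_{X_*} = X_*(X_*^TX_*)^{-1}X_*^T$. Now, integrating with respect to $b$ and using $\theta^T(I - P_{X_*})\theta \ge 0$,

\[
\begin{aligned}
\pi(\theta,x,\sigma_e^2,\sigma_u^2,\sigma_\eta^2,\sigma_x^2 \mid y,X) \le{}& K\exp\Big[-\frac{1}{2\sigma_e^2}\Big(\sum_{i=1}^m n_i(\bar y_i - \theta_i)^2 + SSW_y\Big)\Big]|X_*^TX_*|^{-1/2} \\
&\times \exp\Big[-\frac{1}{2\sigma_\eta^2}\Big(\sum_{i=1}^m n_i(\bar X_i - x_i)^2 + SSW_x\Big)\Big] \\
&\times (\sigma_e^2)^{-\frac{b_e+n_T}{2}-1}\exp\Big(-\frac{a_e}{2\sigma_e^2}\Big)(\sigma_u^2)^{-\frac{b_u+m-2}{2}-1}\exp\Big(-\frac{a_u}{2\sigma_u^2}\Big) \\
&\times (\sigma_\eta^2)^{-\frac{b_\eta+n_T}{2}-1}\exp\Big(-\frac{a_\eta}{2\sigma_\eta^2}\Big)(\sigma_x^2)^{-\frac{b_x+m-1}{2}-1}\exp\Big(-\frac{a_x}{2\sigma_x^2}\Big). \qquad (4.3)
\end{aligned}
\]

Next, integrating with respect to $\theta$, it follows from (4.3) that

\[
\begin{aligned}
\pi(x,\sigma_e^2,\sigma_u^2,\sigma_\eta^2,\sigma_x^2 \mid y,X) \le{}& K|X_*^TX_*|^{-1/2}\exp\Big[-\frac{1}{2\sigma_\eta^2}\Big(\sum_{i=1}^m n_i(\bar X_i - x_i)^2 + SSW_x\Big)\Big] \\
&\times (\sigma_e^2)^{-\frac{b_e+n_T-m}{2}-1}\exp\Big(-\frac{SSW_y + a_e}{2\sigma_e^2}\Big)(\sigma_u^2)^{-\frac{b_u+m-2}{2}-1}\exp\Big(-\frac{a_u}{2\sigma_u^2}\Big) \\
&\times (\sigma_\eta^2)^{-\frac{b_\eta+n_T}{2}-1}\exp\Big(-\frac{a_\eta}{2\sigma_\eta^2}\Big)(\sigma_x^2)^{-\frac{b_x+m-1}{2}-1}\exp\Big(-\frac{a_x}{2\sigma_x^2}\Big). \qquad (4.4)
\end{aligned}
\]

Next, since $a_e > 0$, $a_u > 0$ and $a_x > 0$, integrating out with respect to $\sigma_e^2$, $\sigma_u^2$ and $\sigma_x^2$, one gets

\[
\pi(x,\sigma_\eta^2 \mid y,X) \le K|X_*^TX_*|^{-1/2}(\sigma_\eta^2)^{-\frac{b_\eta+n_T}{2}-1}\exp\Big(-\frac{a_\eta}{2\sigma_\eta^2}\Big)\exp\Big[-\frac{1}{2\sigma_\eta^2}\Big(\sum_{i=1}^m n_i(\bar X_i - x_i)^2 + SSW_x\Big)\Big]. \qquad (4.5)
\]

We now observe that $|X_*^TX_*| = m\sum_{i=1}^m(x_i - \bar x)^2$. Also, from (4.4), conditional on $\sigma_\eta^2$, $y$ and $X$, $x_i \sim N(\bar X_i, \sigma_\eta^2/n_i)$ $(i = 1,\dots,m)$. Now, writing $u = (\sqrt{n_1}x_1,\dots,\sqrt{n_m}x_m)^T$ and $U = (\sqrt{n_1}\bar X_1,\dots,\sqrt{n_m}\bar X_m)^T$, $u \sim N(U, \sigma_\eta^2I_m)$. Then, if $D^{-1} = \mathrm{Diag}(\sqrt{n_1},\dots,\sqrt{n_m})$, $\sum_{i=1}^m(x_i - \bar x)^2 = x^T(I_m - m^{-1}J_m)x = u^TAu$, where $A = D(I_m - m^{-1}J_m)D$. Since rank$(A) = m - 1$, by the spectral decomposition theorem, we can write $A = \sum_{j=1}^{m-1}\lambda_j\xi_j\xi_j^T$, where the $\lambda_j$ are the non-zero eigenvalues of $A$ and $\xi_1,\dots,\xi_{m-1}$ are the corresponding orthonormal eigenvectors. Now, $u^TAu = \sum_{j=1}^{m-1}\lambda_j(\xi_j^Tu)^2$, where $\xi_j^Tu \sim N(\xi_j^TU, \sigma_\eta^2)$. Hence,

\[
\begin{aligned}
\int |X_*^TX_*|^{-1/2}\exp\Big[-\frac{1}{2\sigma_\eta^2}\sum_{i=1}^m n_i(\bar X_i - x_i)^2\Big]dx
&= K(\sigma_\eta^2)^{m/2}\,E\Big[\Big\{m\sum_{j=1}^{m-1}\lambda_j(\xi_j^Tu)^2\Big\}^{-1/2}\Big] \\
&\le K(\sigma_\eta^2)^{(m-1)/2}\,m^{-1/2}\lambda_{\min}^{-1/2}\,E\big[(\chi^2_{m-1})^{-1/2}\big], \qquad (4.6)
\end{aligned}
\]

where $\lambda_{\min} = \min(\lambda_1,\dots,\lambda_{m-1})$. Hence, by (4.5) and (4.6), integrating with respect to $x$,

\[
\pi(\sigma_\eta^2 \mid y,X) \le K(\sigma_\eta^2)^{-\frac{b_\eta+n_T-m+1}{2}-1}\exp\Big(-\frac{SSW_x + a_\eta}{2\sigma_\eta^2}\Big).
\]

It is clear that $\int\pi(\sigma_\eta^2 \mid y,X)\,d\sigma_\eta^2 < \infty$ since $b_\eta + n_T - m > 0$.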

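The implementation of the Bayesian procedure is greatly facilitated by the Markov chain Monte Carlo numerical integration technique, in particular, the Gibbs sampler. This requires generating samples from the full conditionals of each of $\theta$, $x$, $b$, $\mu_x$, $\sigma_e^2$, $\sigma_u^2$, $\sigma_\eta^2$ and $\sigma_x^2$ given the remaining parameters and the data. The details are given below.

(i) $[\theta_i \mid x, b, \mu_x, \sigma_e^2, \sigma_u^2, \sigma_\eta^2, \sigma_x^2, y, X] \sim N\big((1 - B_i)\bar y_i + B_iz_i^Tb,\ (\sigma_e^2/n_i)(1 - B_i)\big)$, where $B_i = (\sigma_e^2/n_i)/(\sigma_e^2/n_i + \sigma_u^2)$;

(ii) $[x_i \mid \theta, b, \mu_x, \sigma_e^2, \sigma_u^2, \sigma_\eta^2, \sigma_x^2, y, X] \sim N\Big(\dfrac{b_1(\theta_i - b_0)/\sigma_u^2 + n_i\bar X_i/\sigma_\eta^2 + \mu_x/\sigma_x^2}{b_1^2/\sigma_u^2 + n_i/\sigma_\eta^2 + 1/\sigma_x^2},\ \dfrac{1}{b_1^2/\sigma_u^2 + n_i/\sigma_\eta^2 + 1/\sigma_x^2}\Big)$;

(iii) $[b \mid \theta, x, \mu_x, \sigma_e^2, \sigma_u^2, \sigma_\eta^2, \sigma_x^2, y, X] \sim N\big((X_*^TX_*)^{-1}X_*^T\theta,\ \sigma_u^2(X_*^TX_*)^{-1}\big)$;

(iv) $[\mu_x \mid \theta, b, x, \sigma_e^2, \sigma_u^2, \sigma_\eta^2, \sigma_x^2, y, X] \sim N(\bar x, \sigma_x^2/m)$;

(v) $[\sigma_e^2 \mid \theta, b, x, \mu_x, \sigma_u^2, \sigma_\eta^2, \sigma_x^2, y, X] \sim IG\big((1/2)(n_T + b_e),\ (1/2)(\sum_{i=1}^m n_i(\bar y_i - \theta_i)^2 + SSW_y + a_e)\big)$;

(vi) $[\sigma_u^2 \mid \theta, b, x, \mu_x, \sigma_e^2, \sigma_\eta^2, \sigma_x^2, y, X] \sim IG\big((1/2)(m + b_u),\ (1/2)(\sum_{i=1}^m(\theta_i - z_i^Tb)^2 + a_u)\big)$;

(vii) $[\sigma_\eta^2 \mid \theta, b, x, \mu_x, \sigma_e^2, \sigma_u^2, \sigma_x^2, y, X] \sim IG\big((1/2)(n_T + b_\eta),\ (1/2)(\sum_{i=1}^m n_i(\bar X_i - x_i)^2 + SSW_x + a_\eta)\big)$;

(viii) $[\sigma_x^2 \mid \theta, b, x, \mu_x, \sigma_e^2, \sigma_u^2, \sigma_\eta^2, y, X] \sim IG\big((1/2)(m + b_x),\ (1/2)(\sum_{i=1}^m(x_i - \mu_x)^2 + a_x)\big)$.

In (v)-(viii) the first argument is the shape-type parameter and the second the scale-type parameter of the inverse gamma distribution.

We generate several sets of these samples. For the $r$th generated set, we obtain the HB estimate : $\hat\gamma_i^{(r)} = (1 - f_iB_i^{(r)})\bar y_i + f_iB_i^{(r)}z_i^{(r)T}b^{(r)}$ and the corresponding Bayes risk : $(\sigma_e^2)^{(r)}\{f_i/N_i + f_i^2(1 - B_i^{(r)})/n_i\}$. After discarding the first half as burn-in (to eliminate any possible instability in the initial generated samples), we use the averaging principle and take the average of the HB estimates over all the remaining sets to obtain the final HB estimate. The same thing is done for the Bayes risk.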
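
The Gibbs steps (i)-(viii) and the averaging rule above can be sketched as follows. This is a minimal sketch under stated assumptions, not the code used for the dissertation: all four variance components share one hypothetical pair of hyper-parameters `(a, b_h)`, the starting values are arbitrary, and the names `gibbs` and `inv_gamma` are illustrative.

```python
import numpy as np


def inv_gamma(rng, shape, scale):
    """Draw z with density proportional to z^(-shape-1) * exp(-scale / z)."""
    return 1.0 / rng.gamma(shape, 1.0 / scale)


def gibbs(y, X, N, n_iter=2000, hyper=(1e-3, 1e-3), seed=0):
    """Return (HB estimates, Bayes risks) averaged over the second half of the chain."""
    rng = np.random.default_rng(seed)
    a, b_h = hyper                      # scale-type a and shape-type b (assumed common)
    n = np.array([len(v) for v in y], dtype=float)
    N = np.asarray(N, dtype=float)
    m, n_T = len(y), n.sum()
    ybar = np.array([np.mean(v) for v in y])
    Xbar = np.array([np.mean(v) for v in X])
    ssw_y = sum(np.sum((np.asarray(v) - c) ** 2) for v, c in zip(y, ybar))
    ssw_x = sum(np.sum((np.asarray(v) - c) ** 2) for v, c in zip(X, Xbar))
    f = (N - n) / N
    x, b, mu_x = Xbar.copy(), np.zeros(2), Xbar.mean()   # arbitrary starting values
    s2e = s2u = s2eta = s2x = 1.0
    est, risk = [], []
    for it in range(n_iter):
        Z = np.column_stack([np.ones(m), x])
        B = (s2e / n) / (s2e / n + s2u)                                   # step (i)
        theta = rng.normal((1 - B) * ybar + B * (Z @ b), np.sqrt(s2e / n * (1 - B)))
        prec = b[1] ** 2 / s2u + n / s2eta + 1.0 / s2x                    # step (ii)
        mean = (b[1] * (theta - b[0]) / s2u + n * Xbar / s2eta + mu_x / s2x) / prec
        x = rng.normal(mean, np.sqrt(1.0 / prec))
        Z = np.column_stack([np.ones(m), x])                              # step (iii)
        ZtZ_inv = np.linalg.inv(Z.T @ Z)
        L = np.linalg.cholesky(s2u * ZtZ_inv)
        b = ZtZ_inv @ Z.T @ theta + L @ rng.standard_normal(2)
        mu_x = rng.normal(x.mean(), np.sqrt(s2x / m))                     # step (iv)
        s2e = inv_gamma(rng, (n_T + b_h) / 2, (np.sum(n * (ybar - theta) ** 2) + ssw_y + a) / 2)
        s2u = inv_gamma(rng, (m + b_h) / 2, (np.sum((theta - Z @ b) ** 2) + a) / 2)
        s2eta = inv_gamma(rng, (n_T + b_h) / 2, (np.sum(n * (Xbar - x) ** 2) + ssw_x + a) / 2)
        s2x = inv_gamma(rng, (m + b_h) / 2, (np.sum((x - mu_x) ** 2) + a) / 2)
        if it >= n_iter // 2:           # keep only the second half (burn-in discarded)
            B = (s2e / n) / (s2e / n + s2u)
            est.append((1 - f * B) * ybar + f * B * (Z @ b))
            risk.append(s2e * (f / N + f ** 2 * (1 - B) / n))
    return np.mean(est, axis=0), np.mean(risk, axis=0)
```

The inverse gamma draws are obtained as reciprocals of gamma draws, so the first argument of `inv_gamma` plays the role of the shape-type parameter and the second the scale-type parameter, matching the order used in (v)-(viii).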




3.5  Simulation  Study 

We conducted a simulation study to compare the performance of the HB and the EB estimators with that of the sample mean. To this end, we created a finite population of size 1,400 spread across 12 strata of sizes 50, 250, 50, 100, 200, 150, 50, 150, 100, 150, 100 and 50. The responses were generated under the super-population model as considered in this study with $b_0 = 100$, $b_1 = 2$, $\sigma_e^2 = 100$, $\sigma_u^2 = 16$, $\sigma_\eta^2 = 25$, $\mu_x = 194$ and $\sigma_x^2 = 2{,}737$. A 2% simple random sample was drawn from each stratum. Accordingly, the sample sizes for the 12 strata are given respectively by 1, 5, 1, 2, 4, 3, 1, 3, 2, 3, 2 and 1.

We drew 400 independent samples $(y_{ij}, X_{ij})$ $(j = 1,\dots,n_i;\ i = 1,\dots,12)$ from this population, and found for each sample the sample mean, the EB and the HB estimators. To obtain the HB estimators, we ran a Gibbs chain of size 10,000 with a burn-in of the first 5,000. The HB estimators of the population means $\gamma_i$ are the averages over the remaining 5,000 Gibbs samples generated. Moreover, we took the average of the squared differences of the estimators from the true means over the 400 simulations and took their square roots to obtain the root mean squared errors. The resulting values for the sample means, EB and HB estimators are reported as RMSE(SM), RMSE(EB), and RMSE(HB) respectively.
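
The sampling design and the RMSE criterion described above can be sketched as follows. The stratum sizes and the 2% sampling fraction are taken from the text; the helper name `rmse` and the array layout (one row per simulated sample, one column per stratum) are assumptions made for illustration.

```python
import numpy as np

# Stratum sizes of the simulated finite population (from the text).
N = np.array([50, 250, 50, 100, 200, 150, 50, 150, 100, 150, 100, 50])

# A 2% sample from each stratum gives the per-stratum sample sizes.
n = np.rint(0.02 * N).astype(int)


def rmse(estimates, truth):
    """Root mean squared error per stratum.

    estimates : array of shape (n_simulations, n_strata)
    truth     : array of shape (n_strata,) holding the true stratum means
    """
    estimates = np.asarray(estimates, dtype=float)
    truth = np.asarray(truth, dtype=float)
    return np.sqrt(np.mean((estimates - truth) ** 2, axis=0))
```

Applying `rmse` to the 400 simulated sample means, EB estimates and HB estimates in turn would give the three RMSE columns of the results table.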

Table 3-1 reports the sample sizes, the true means (TM), the sample means (SM), the EB and the HB estimators as well as RMSE(SM), RMSE(EB), and RMSE(HB) for the 12 counties. The counties are denoted by $i = 1,\dots,12$ in the table.




Table 3-1: The sample sizes, means and RMSE's for the 12 counties

  i   n_i     TM       SM       EB       HB     RMSE(SM)  RMSE(EB)  RMSE(HB)
  1    1    127.32   127.16   130.14   131.90     8.76      6.99      6.88
  2    5    117.41   117.25   118.99   121.53     4.46      4.77      4.13
  3    1    152.10   152.12   147.32   145.30     9.01      8.73      8.24
  4    2    160.24   160.05   155.54   151.39     6.82      7.88     10.62
  5    4    127.41   127.57   128.57   130.75     4.61      4.43      5.08
  6    3    140.51   140.62   140.11   139.89     5.20      4.61      3.92
  7    1    141.34   141.15   139.77   139.59     9.95      7.21      5.65
  8    3    155.97   156.40   153.69   151.47     5.77      6.01      6.94
  9    2    131.61   131.46   132.50   134.50     6.59      5.48      5.25
 10    3    137.18   136.21   136.29   136.82     5.58      4.93      4.09
 11    2    148.14   148.01   145.86   143.94     6.62      6.16      6.46
 12    1    133.43   133.99   134.92   135.58     8.82      6.45      5.27


It follows from the above table that according to the RMSE criterion, the HB estimator is doing better than both the sample mean and the EB estimator in 8 out of the 12 counties. In most of these counties, the EB estimator is second best and the sample mean is the worst, having the largest RMSE. The exceptions are counties 4, 5, 8 and 11. The sample mean has the smallest RMSE in counties 4 and 8 while the EB estimator has the smallest RMSE in counties 5 and 11.
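
The comparison drawn from Table 3-1 can be checked directly from the reported RMSE columns. The short sketch below simply re-tabulates those three columns and finds, for each county, which estimator has the smallest RMSE; the variable names are illustrative.

```python
import numpy as np

# Columns: RMSE(SM), RMSE(EB), RMSE(HB) for counties 1..12, copied from Table 3-1.
rmse_table = np.array([
    [8.76, 6.99, 6.88], [4.46, 4.77, 4.13], [9.01, 8.73, 8.24],
    [6.82, 7.88, 10.62], [4.61, 4.43, 5.08], [5.20, 4.61, 3.92],
    [9.95, 7.21, 5.65], [5.77, 6.01, 6.94], [6.59, 5.48, 5.25],
    [5.58, 4.93, 4.09], [6.62, 6.16, 6.46], [8.82, 6.45, 5.27],
])
labels = np.array(["SM", "EB", "HB"])

# Estimator with the smallest RMSE in each county.
best = labels[rmse_table.argmin(axis=1)]
winners = {
    name: [i + 1 for i, b in enumerate(best) if b == name] for name in labels
}
```

Here `winners["HB"]` lists the counties in which the HB estimator has the smallest RMSE, and similarly for `"EB"` and `"SM"`.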

In the following section, we conduct a data analysis and compare the performance of the HB estimator with that of other standard estimators.

3.6  Data  Analysis 

We use the data used by BHF (1988) for analysis. Knowledge of the area under different crops is important to the US Department of Agriculture. Sample surveys have been designed to estimate crop areas for large regions, such as crop-reporting districts, individual states, and the United States as a whole. Predicting crop areas for small areas such as counties has generally not been attempted, due to a lack of availability of data from farm surveys for these areas. The use of satellite data in association with farm-land survey observations has been the subject of considerable research over the years. In their paper, BHF considered data for 12 counties in Iowa, obtained from the 1978 June Enumerative Survey of the USDA as well as from the satellite LANDSAT during the 1978 growing season. The purpose was to predict the area under soybean and corn in these counties.

As we have discussed before, BHF developed a variance components model for small area estimation and they provided analysis of the soybean data (reported by farmers) using two covariates, corn and soybean (reported by satellite). Below is a part of the actual data.


Table 3-2: Survey and Satellite Data for Soybeans in 12 Iowa counties

               Number of segments     Reported     Number of pixels     Mean number of
                                      hectares     in sample segments   pixels per segment
County         Sample     County      (Soybeans)   (Soybeans)           (Soybeans)
Cerro Gordo       1         545          8.09            55                 189.70
Hamilton          1         566        106.03           218                 196.65
Worth             1         394        103.60           250                 205.28
Humboldt          2         424          6.47            96                 220.22
                                        63.82           178
Franklin          3         564         43.50           137                 188.06
                                        71.43           206
                                        42.49           165
Pocahontas        3         570        105.26           218                 247.13
                                        76.49           221
                                       174.34           338
Winnebago         3         402         95.67           128                 185.37
                                        76.57           147
                                       174.34           204
Wright            3         567         37.84            77                 221.36
                                       131.12           217
                                       124.44           258
Webster           4         687        144.15           303                 247.09
                                       103.60           221
                                        88.59           222
                                       115.58           274
Hancock           5         569         99.15           190                 198.66
                                       124.56           270
                                       110.88           172
                                       109.14           228
                                       143.66           297
Kossuth           5         965         91.05           167                 204.61
                                       132.33           191
                                       143.14           249
                                       104.13           182
                                       118.57           179
Hardin            6         556        102.59           262                 177.05
                                        29.46            87
                                        69.28           160
                                        99.15           221
                                       143.66           345
                                        94.49           190



BHF  have  noted  : “the  second  segment  in  Hardin  county  deviated  from  other 
observations.  The  reported  hectares  of  corn  for  the  second  segment  were  identical 
to  that  of  the  first  segment.  Therefore,  all  data  for  that  (second)  segment  are 
deleted  from  our  analyses.  The  soybean  data  are  deleted  for  convenience”.  So, 
all  their  results,  and  subsequently  ours,  are  based  on  the  data  after  deleting  that 
segment. 

We show below the fitted model as obtained by BHF in their analysis.

\[
\hat y_{ij} = -16 + .028\,x_{1ij} + .494\,x_{2ij},
\]

where $y_{ij}$ is the soybean acreage reported by farmers, $x_1$ refers to the corn pixels and $x_2$ to soybean pixels (reported by satellite). It turns out that $\hat\sigma_e^2 = 195$, $\hat\sigma_u^2 = 272$.

In  Table  3-3  below,  we  reproduce  the  results  from  BHF’s  work.  It  shows  the 

predicted  hectares  of  soybean,  with  standard  errors  of  alternative  predictors.  In 

particular,  it  provides  a quick  comparison  of  standard  errors  of  the  best  predictor 

(as  derived  by  BHF),  the  survey  regression  predictor  and  the  sample  mean. 

Table 3-3: Predicted Hectares of Soybean With Standard Errors of Alternative Predictors

                                                     STANDARD ERRORS
               Sample     Predicted     Best         Survey regression    Sample
County         Segments   hectares      predictor    predictor            mean
Cerro Gordo       1          77.8         12.0           15.6              29.1
Hamilton          1          94.8         11.8           14.8              29.1
Worth             1          86.9         11.5           14.2              29.1
Humboldt          2          79.7          9.7           11.1              20.6
Franklin          3          65.2          7.6            8.1              16.8
Pocahontas        3         113.8          7.7            8.2              16.8
Winnebago         3          98.5          7.7            8.3              16.8
Wright            3         112.8          7.8            8.4              16.8
Webster           4         109.6          6.7            7.0              14.6
Hancock           5         101.0          6.2            6.5              13.0
Kossuth           5         119.9          6.1            6.3              13.0
Hardin            5          74.9          6.6            6.9              13.0



They commented that the survey regression predictor compares favorably with the best predictor for each county.

The next table is similar to the above table. However, herein, we consider only soybean pixels as the covariate. This is due to the fact that, based on the p-values of the slopes in their model, BHF have observed that “only the coefficient of soybean pixels is significantly different from 0 for the soybean function”. This motivated us to develop our model and data analysis taking only one covariate, viz., soybean pixels, for prediction of soybean hectares.

Thus,  in  this  next  table,  we  show  the  BHF  predictors  for  each  county  (and 
their  standard  errors)  based  only  on  one  covariate,  viz.,  soybean  pixels. 

Now, the fitted model is given by

\[
\hat y_{ij} = -3.12 + .472\,x_{2ij}.
\]

Further, $\hat\sigma_e^2 = 191.14$ and $\hat\sigma_u^2 = 259.93$.

This  time,  it  is  observed  that  the  performance  of  the  survey  regression  predictor 
drifts  further  away  from  the  best  predictor.  We  also  tried  other  competitors  such  as 
the  simple  linear  regression-based  estimators  but  the  results  were  not  satisfactory. 
Hence,  these  are  not  shown  in  the  table. 




Table 3-4: Predicted Hectares of Soybean With Standard Errors of Alternative Predictors, using soybean pixels as the only covariate

                                           STANDARD ERRORS
               Sample     Predicted     Best         Sample
County         Segments   hectares      predictor    mean
Cerro Gordo       1          77.19        12.45       29.1
Hamilton          1          93.64        11.93       29.1
Worth             1          86.70        11.97       29.1
Humboldt          2          80.18         9.69       20.6
Franklin          3          65.03         7.73       16.8
Pocahontas        3         113.17         7.71       16.8
Winnebago         3          97.23         7.76       16.8
Wright            3         113.31         7.86       16.8
Webster           4         109.75         6.76       14.6
Hancock           5         100.94         6.23       13.0
Kossuth           5         120.24         6.11       13.0
Hardin            5          74.78         6.57       13.0


It  is  thus  confirmed  that  the  deletion  of  the  other  covariate  (corn  pixels)  has  not 
affected  the  performance  of  the  best  predictor.  This  agrees  with  the  observation 
that  the  coefficient  for  corn  pixels  in  the  full  model  is  insignificant. 

In  the  next  table,  we  provide  the  predicted  hectares  and  estimated  standard 
errors  of  the  predicted  hectares  for  each  county  according  to  our  approach  vis-a-vis 
the  BHF  approach. 

At  this  stage,  it  must  be  noted  that  we  have  considered  a model  different  from 
theirs  in  the  sense  that  we  incorporated  possible  measurement  errors  in  the  values 
of  the  covariates.  Thus,  this  time,  before  we  apply  any  method  towards  analyzing 
the  data,  we  note  that  the  x-observations  are  subject  to  measurement  error. 

The whole purpose is to view the $X_{ij}$'s as random and generate “copies” of these values so that the computations for the BHF predictor can be repeated over and over again and finally the estimates and their standard errors may be computed using the averaging principle. Details used to obtain the BHF predictors and their standard errors are given after the table.




Table 3-5: Predicted Hectares of Soybean (HB and BHF) With Corresponding Standard Errors, using soybean pixels as the only covariate

                          Predicted    Predicted        STANDARD ERRORS
               Sample     hectares     hectares                         Sample
County         Segments   (BHF)        (HB)          BHF       HB       mean
Cerro Gordo       1         52.78        32.06       22.51    20.54     29.1
Hamilton          1         97.31       100.88       22.45    20.54     29.1
Worth             1         98.04        99.20       22.45    20.49     29.1
Humboldt          2         57.10        44.70       18.55    15.80     20.6
Franklin          3         62.98        57.11       15.96    13.32     16.8
Pocahontas        3        111.40       114.86       15.98    13.32     16.8
Winnebago         3         88.75        88.59       16.03    13.32     16.8
Wright            3         95.60        96.64       16.11    13.32     16.8
Webster           4        107.20       110.56       14.34    11.72     14.6
Hancock           5        113.37       115.13       12.97    10.59     13.0
Kossuth           5        113.08       116.93       13.06    10.60     13.0
Hardin            5         99.53        89.69       12.91    10.58     13.0


In  the  following,  we  propose  to  give  a detailed  account  of  how  we  obtained  the 
figures  (for  the  predicted  hectares  and  their  standard  errors)  using  BHF’s  approach 
in  the  above  table. 

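We are given the $X_{ij}$ values for $i = 1, 2,\dots,m$ and $j = 1, 2,\dots,n_i$, as also the population county mean values $\bar X_{i(p)}$ for $i = 1, 2,\dots,m$.

In our set-up, we have modelled the $X_{ij}$'s as $X_{ij} = x_i + \eta_{ij} = \mu_x + v_i + \eta_{ij}$, where the $v_i$'s are iid $N(0,\sigma_x^2)$ and the $\eta_{ij}$'s are iid $N(0,\sigma_\eta^2)$. Thus it is a one-way random effects model with additional information on the population means for each county.

An estimator of $\sigma_\eta^2$ is given by

\[
\hat\sigma_\eta^2 = \Big(SSW_x + \sum_{i=1}^m (n_i^{-1} - N_i^{-1})^{-1}(\bar X_i - \bar X_{i(p)})^2\Big)\Big/\Big(m + \sum_{i=1}^m (n_i - 1)\Big).
\]

It now remains to estimate $\mu_x$ and $\sigma_x^2$, for which the sufficient statistics are $(\bar X_{1(p)}, \bar X_{2(p)},\dots,\bar X_{m(p)})$ and we have the model :

$E(\bar X_{i(p)}) = \mu_x$,

$V(\bar X_{i(p)}) = \sigma_x^2 + \sigma_\eta^2/N_i = \sigma_i^2$, say,

and $\mathrm{Cov}(\bar X_{i(p)}, \bar X_{j(p)}) = 0$ for $i \ne j$.

It now follows that we have the following two equations for estimation of $\mu_x$ and $\sigma_x^2$ :

(I) $\hat\mu_x = \sum_{i=1}^m \bar X_{i(p)}\sigma_i^{-2}\big/\sum_{i=1}^m \sigma_i^{-2}$, where $\sigma_i^2 = \sigma_x^2 + \sigma_\eta^2/N_i$;

(II) $m - 1 = \sum_{i=1}^m (\bar X_{i(p)} - \hat\mu_x)^2/\sigma_i^2$.

Since the $N_i$'s are very large, we make the approximation : $\sigma_i^2 = \sigma_x^2$.

Therefore, we obtain, as an approximation,

(b) $\hat\mu_x = \sum_{i=1}^m \bar X_{i(p)}/m$;

(c) $\hat\sigma_x^2 = \sum_{i=1}^m (\bar X_{i(p)} - \hat\mu_x)^2/(m - 1)$.

Once these are estimated, we generate
(i) $X_{ij}^* = \hat\mu_x + \hat\sigma_x\epsilon_i^* + \hat\sigma_\eta\epsilon_{ij}^{**}$
(ii) $\bar X_{i(p)}^* = \hat\mu_x + \hat\sigma_x\epsilon_i^* + \hat\sigma_\eta\epsilon_i^{**}/\sqrt{N_i}$
where the $\epsilon^*$'s and $\epsilon^{**}$'s are iid $N(0,1)$.

In effect, therefore, we have generated a new data set for the given covariates' values in the above manner. This is done for each domain. The whole process is now repeated $M$ times for a suitable value of $M$ so that we have, at the end, $M$ independent data sets so generated. Of course, we keep the same response ($y$) values.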
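
The replication step described above can be sketched as follows. This is a minimal sketch, assuming that the same area-level draw $\epsilon_i^*$ is shared by the unit-level values and the population mean of county $i$; the function name `replicate_covariates` is hypothetical and the estimates are passed in as arguments.

```python
import numpy as np


def replicate_covariates(n, N, mu_x, s2_x, s2_eta, rng):
    """Generate one synthetic copy of the covariate data.

    n, N    : sample sizes n_i and population sizes N_i per county
    mu_x, s2_x, s2_eta : estimates of mu_x, sigma_x^2 and sigma_eta^2
    Returns (list of per-county unit-level values, array of population means).
    """
    n = np.asarray(n, dtype=int)
    N = np.asarray(N, dtype=float)
    m = len(n)
    # County-level component shared by steps (i) and (ii).
    x_i = mu_x + np.sqrt(s2_x) * rng.standard_normal(m)
    # (i) unit-level covariates for the sampled segments
    X_star = [x_i[i] + np.sqrt(s2_eta) * rng.standard_normal(n[i]) for i in range(m)]
    # (ii) population county means, with measurement error scaled by 1/sqrt(N_i)
    Xbar_pop_star = x_i + np.sqrt(s2_eta) * rng.standard_normal(m) / np.sqrt(N)
    return X_star, Xbar_pop_star
```

Calling this function $M$ times with a common random generator, and pairing each copy with the unchanged responses, gives the $M$ data sets to which the BHF analysis is then applied.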

Now,  we  apply  BHF  method  of  data  analysis  using  one  covariate  and  arrive 
at  the  estimates  for  domain  means  and  estimated  standard  errors  for  each  such 
estimate. 

Finally, we combine the estimates of the domain means by averaging over the $M$ sets and these give the final results. For mean square errors, we use the formula $MSE = E(MSE_{BHF})$, where $MSE_{BHF}$ is the expected MSE, conditional on the $X_{ij}$'s. This term comes from the average of the estimated mean square errors of the estimates from the $M$ sets.

Our computations yield

(i) $\hat\mu_x = 206.765$

(ii) $\hat\sigma_x^2 = 527.88$

(iii) $\hat\sigma_\eta^2 = 1907.262$ (and hence, $\hat\sigma_\eta^2/N_i$ is at most 4.75 which justifies the approximation). Further, we have taken $M = 100$.
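
As a rough check on these figures, the approximate moment estimators (b) and (c) can be applied to the county mean numbers of soybean pixels per segment listed in Table 3-2. The sketch below is illustrative only and the variable names are hypothetical; it reproduces the reported mean and comes close to the reported between-county variance.

```python
import numpy as np

# Mean number of soybean pixels per segment for the 12 counties (Table 3-2).
county_means = np.array([
    189.70, 196.65, 205.28, 220.22, 188.06, 247.13,
    185.37, 221.36, 247.09, 198.66, 204.61, 177.05,
])

mu_x_hat = county_means.mean()           # estimator (b): simple average
sigma2_x_hat = county_means.var(ddof=1)  # estimator (c): divisor m - 1
```

The first quantity agrees with the reported $\hat\mu_x = 206.765$, and the second is within about 0.1 of the reported 527.88, which suggests that the reported values come from these simple moment formulas.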

Next, we briefly discuss the computational aspects for the HB predictors and the associated standard errors.

We set all the parameters with initial values : normal(0, 1) for the regression coefficients ($b_0$ and $b_1$) and $\theta$, and inverse gamma for all the variance components (based on non-informative priors for all the hyper-parameters). Then we generate 20,000 samples based on the conditionals given in Section 3.4. We discard the first 10,000 as burn-in and keep the last 10,000. For each of these 10,000 sets, we calculate the HB estimate and Bayes risk (given in Section 3.4) and take the average to obtain the final estimate and mean square error.

It  follows  from  Table  3-5  that  the  HB  estimates  have  higher  precision  than  the 
BHF  estimates  in  all  the  counties  when  the  covariates  are  measured  with  error. 


CHAPTER  4 

EMPIRICAL  AND  HIERARCHICAL  BAYES  ESTIMATION  FOR  BINARY 

RESPONSE 

4.1  Introduction 

This chapter focuses on EB and HB SAE based on binary data. The motivation behind this work comes from analyzing a data set on health insurance where the purpose is to estimate the proportion of individuals without health insurance in a given year for several small domains (cross-classified by age, sex and other demographic characteristics) in minority subpopulations. For the entire US population, the direct estimates for these domains, namely the sample proportions, are fairly reliable (since the sample size for each domain is reasonably large). However, when our analysis is targeted towards specific subpopulations such as Asians, Hispanics and other similar minority sectors of the community where the sample size is not high, we run into the usual small area problems and need to develop indirect methods of estimation.

We employ both HB and EB methodologies to obtain estimates of these proportions and also find the associated measures of precision. Results are derived in a general setup for the natural exponential family with quadratic variance functions and then specifically derived for the binary case (since the response is binary).

The outline of the remaining sections is as follows. Section 4.2 discusses the general HB methodology needed for obtaining the small domain estimates and the associated measures of precision for a general one-parameter exponential family of densities. In Section 4.3, we discuss in detail the alternative methodology based on EB estimation, specifically for binary data. We illustrate these methods in Section 4.4 by estimating the proportion of uninsured in several cross-sections of the Asian community.

4.2  HB  Model 

We begin with a general one-parameter exponential family model given by

\[
f(y_{ij} \mid \theta_{ij}) = \exp\big[\phi_{ij}^{-1}\{y_{ij}\theta_{ij} - \psi(\theta_{ij})\} + c(y_{ij},\phi_{ij})\big],
\]

$j = 1,\dots,n_i$, $i = 1,\dots,k$. Here the $\phi_{ij}$ are known, and are assumed to be 1 without loss of generality. Also, $E(y_{ij} \mid \theta_{ij}) = \psi'(\theta_{ij}) = \mu_{ij}$, say, and $V(y_{ij} \mid \theta_{ij}) = \psi''(\theta_{ij})$. Since $V(y_{ij} \mid \theta_{ij})$ is positive, $\mu_{ij}$ is a one-to-one function of $\theta_{ij}$.

If $y_{ij}$ is binary with success probability $p_{ij}$, then $\theta_{ij} = \mathrm{logit}(p_{ij})$. If $y_{ij} \sim \mathrm{Poisson}(\lambda_{ij})$, then $\theta_{ij} = \log(\lambda_{ij})$. In our example, $y_{ij}$ is 1 or 0 depending on whether the person does not or does have health insurance. Also, in this case, we are interested in the estimation of the $p_{ij}$, the proportion of people without health insurance. In this section, however, we discuss how to carry out the analysis for the general hierarchical Bayesian model when we are interested in estimating $\mu_{iw} = \sum_{j=1}^{n_i} w_{ij}^*\mu_{ij}\big/\sum_{j=1}^{n_i} w_{ij}^*$, where $w_{ij}^*$ is the weight attached to the $j$th unit in the $i$th small domain. We will write $w_{ij} = w_{ij}^*\big/\sum_{j=1}^{n_i} w_{ij}^*$, so that $\sum_{j=1}^{n_i} w_{ij} = 1$ for each $i = 1,\dots,k$. Specific applications will be considered in Sections 4 and 5.
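
For the binary case, the quantities $\psi$, $\psi'$ and $\psi''$ of the exponential family form can be written out explicitly. The sketch below uses the standard Bernoulli facts $\psi(\theta) = \log(1 + e^{\theta})$, $\psi'(\theta) = p$ and $\psi''(\theta) = p(1 - p)$; the function names are chosen here for illustration.

```python
import math


def psi(theta):
    """Cumulant function of the Bernoulli family: log(1 + exp(theta))."""
    return math.log1p(math.exp(theta))


def mean(theta):
    """psi'(theta): the success probability p = 1 / (1 + exp(-theta))."""
    return 1.0 / (1.0 + math.exp(-theta))


def variance(theta):
    """psi''(theta): the Bernoulli variance p * (1 - p)."""
    p = mean(theta)
    return p * (1.0 - p)


def logit(p):
    """Canonical parameter theta = log(p / (1 - p))."""
    return math.log(p / (1.0 - p))
```

Since $\psi''$ is positive, `mean` is strictly increasing and `logit` is its inverse, which is the one-to-one correspondence between $\mu_{ij}$ and $\theta_{ij}$ noted above.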

The next stage of the model is

\[
\theta_{ij} = x_{ij}^Tb + u_i + e_{ij} \quad (j = 1,\dots,n_i;\ i = 1,\dots,k),
\]

where the $x_{ij}$ are $p\,(< k)$-component design vectors, $b$ is the vector of regression parameters, the $u_i$ are the random effects, in our example, the effects of the small domains, and the $e_{ij}$ are the errors, which account for any unexplained source of variability. It is assumed that the $u_i$ and the $e_{ij}$ are mutually independent with $u_i$ iid $N(0,\sigma_u^2)$, and $e_{ij}$ iid $N(0,\sigma_e^2)$. Also, let $X^T = (x_{11},\dots,x_{1n_1},\dots,x_{k1},\dots,x_{kn_k})$ and assume rank$(X) = p$.




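Finally, it is assumed that $b$, $\sigma_u^2$ and $\sigma_e^2$ are mutually independent with $b \sim \mathrm{uniform}(R^p)$, $\sigma_u^2 \sim IG(c/2, d/2)$, and $\sigma_e^2 \sim IG(g/2, h/2)$. A random variable $Z$ is said to have an $IG(\alpha,\beta)$ distribution if it has a pdf of the form $f(z) \propto z^{-\beta-1}\exp(-\alpha/z)$.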

The  propriety  of  the  posterior  follows  provided  0 < (Ghosh  et  al 

(1998)) and $h + S > p$, where $S = \#\{(i,j) : \int f(y_{ij} \mid \theta_{ij})\,d\theta_{ij} < \infty\}$.


This is a nonconjugate Bayesian analysis, and is not implementable analytically. Instead, we use the Markov chain Monte Carlo (MCMC) numerical integration technique. In particular, we employ the Gibbs sampler. To this end, we need to find the full conditionals of $\theta$, $b$, $u$, $\sigma_u^2$ and $\sigma_e^2$.

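Let $y = (y_{11},\dots,y_{1n_1},\dots,y_{k1},\dots,y_{kn_k})^T$, $\theta = (\theta_{11},\dots,\theta_{1n_1},\dots,\theta_{k1},\dots,\theta_{kn_k})^T$, $u = (u_1,\dots,u_k)^T$, and $n_T = \sum_{i=1}^k n_i$. Then the joint posterior is given by

\[
\begin{aligned}
\pi(\theta,b,u,\sigma_u^2,\sigma_e^2 \mid y) \propto{}& \prod_{i=1}^k\prod_{j=1}^{n_i} f(y_{ij} \mid \theta_{ij}) \times (\sigma_e^2)^{-n_T/2}\exp\Big[-\frac{1}{2\sigma_e^2}\sum_{i=1}^k\sum_{j=1}^{n_i}(\theta_{ij} - x_{ij}^Tb - u_i)^2\Big] \\
&\times (\sigma_u^2)^{-k/2}\exp\Big[-\frac{1}{2\sigma_u^2}\sum_{i=1}^k u_i^2\Big] \times (\sigma_u^2)^{-d/2-1}\exp\Big(-\frac{c}{2\sigma_u^2}\Big)(\sigma_e^2)^{-h/2-1}\exp\Big(-\frac{g}{2\sigma_e^2}\Big).
\end{aligned}
\]

Writing $\tilde u = (u_11_{n_1}^T,\dots,u_k1_{n_k}^T)^T$, where $1_q$ is a $q$-dimensional column vector with each element equal to one, the full conditionals are given by

$b \mid \theta, u, \sigma_u^2, \sigma_e^2, y \sim N\big((X^TX)^{-1}X^T(\theta - \tilde u),\ \sigma_e^2(X^TX)^{-1}\big)$;

$u_i \mid \theta, b, \sigma_u^2, \sigma_e^2, y \sim N\big((n_i + \sigma_e^2/\sigma_u^2)^{-1}\sum_{j=1}^{n_i}(\theta_{ij} - x_{ij}^Tb),\ \sigma_e^2(n_i + \sigma_e^2/\sigma_u^2)^{-1}\big)$;

$\sigma_u^2 \mid \theta, b, u, \sigma_e^2, y \sim IG\big(\tfrac12(c + \sum_{i=1}^k u_i^2),\ \tfrac12(d + k)\big)$;

$\sigma_e^2 \mid \theta, b, u, \sigma_u^2, y \sim IG\big(\tfrac12(g + \sum_{i=1}^k\sum_{j=1}^{n_i}(\theta_{ij} - x_{ij}^Tb - u_i)^2),\ \tfrac12(h + n_T)\big)$;

$\pi(\theta_{ij} \mid b, u, \sigma_u^2, \sigma_e^2, y) \propto f(y_{ij} \mid \theta_{ij})\exp\big[-\frac{1}{2\sigma_e^2}(\theta_{ij} - x_{ij}^Tb - u_i)^2\big]$.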

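Our data analysis is based on generating samples from the above conditionals specialized to the binary case. Generation of samples from the conditionals of $\sigma_u^2$, $\sigma_e^2$, $u$ and $b$ is standard. This is not so for the $\theta_{ij}$, and requires the Metropolis-Hastings algorithm. If $\mu_{ij}^{(r)}$ denotes the sampled value of $\mu_{ij}$ generated from the $r$th draw, and the number of draws is $R$, then the Monte Carlo estimate of $E(\mu_{ij} \mid y)$ is $R^{-1}\sum_{r=1}^R \mu_{ij}^{(r)}$. Similarly, the Monte Carlo estimate of $\mathrm{var}(\mu_{ij} \mid y)$ is $R^{-1}\sum_{r=1}^R (\mu_{ij}^{(r)})^2 - (R^{-1}\sum_{r=1}^R \mu_{ij}^{(r)})^2$. Finally, the Monte Carlo estimate of $\mathrm{cov}(\mu_{ij}, \mu_{ij'} \mid y)$ is given by $R^{-1}\sum_{r=1}^R \mu_{ij}^{(r)}\mu_{ij'}^{(r)} - (R^{-1}\sum_{r=1}^R \mu_{ij}^{(r)})(R^{-1}\sum_{r=1}^R \mu_{ij'}^{(r)})$.

These estimates are then utilized to estimate $E(\mu_{iw} \mid y)$ and $V(\mu_{iw} \mid y)$.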
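
A Metropolis-Hastings update for a single $\theta_{ij}$ in the binary case, together with the Monte Carlo moment estimates above, might be sketched as follows. This is an illustration under simplifying assumptions rather than the sampler used in the analysis: a random-walk proposal with a hypothetical step size is used, and the conditional mean $x_{ij}^Tb + u_i$ (called `center` here) and $\sigma_e^2$ are held fixed, whereas in the full Gibbs sampler they change from sweep to sweep.

```python
import numpy as np


def log_target(theta, y, center, s2e):
    """Log of f(y | theta) * exp(-(theta - center)^2 / (2 s2e)) for binary y."""
    return y * theta - np.logaddexp(0.0, theta) - (theta - center) ** 2 / (2.0 * s2e)


def mh_draws(y, center, s2e, R=5000, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings draws of theta (proposal is an assumption)."""
    rng = np.random.default_rng(seed)
    theta, draws = center, np.empty(R)
    for r in range(R):
        prop = theta + step * rng.standard_normal()
        log_ratio = log_target(prop, y, center, s2e) - log_target(theta, y, center, s2e)
        if np.log(rng.uniform()) < log_ratio:
            theta = prop                      # accept the proposal
        draws[r] = theta
    return draws


def mc_moments(mu_draws):
    """Monte Carlo estimates of E(mu | y) and var(mu | y) from R draws."""
    mu_draws = np.asarray(mu_draws, dtype=float)
    m1 = mu_draws.mean()
    return m1, (mu_draws ** 2).mean() - m1 ** 2


def expit(theta):
    """Map theta to the success probability mu = 1 / (1 + exp(-theta))."""
    return 1.0 / (1.0 + np.exp(-theta))
```

For instance, `mc_moments(expit(mh_draws(1, 0.0, 4.0)))` gives the estimated posterior mean and variance of the success probability for a unit with $y_{ij} = 1$ under these fixed values.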

4.3  EB  estimation 

Recall that $y_{ij}$ is the response of the $j$th unit in the $i$th small domain ($j = 1, \ldots, n_i$; $i = 1, \ldots, k$). We assume, once again, that $y_{ij}$ has a probability function (or a probability density function) belonging to the natural exponential family, i.e.


where the $\phi_{ij}$ are assumed to be 1. However, here we assume in addition that $V(y_{ij} \mid \theta_{ij}) = v_0 + v_1\mu_{ij} + v_2\mu_{ij}^2 = Q(\mu_{ij})$, say, where $v_0$, $v_1$ and $v_2$ are not simultaneously zero, i.e. the variance is at most a quadratic function of the mean. This family of distributions is usually referred to as the natural exponential family quadratic variance function (NEF-QVF) family of distributions. Morris (1982, 1983) characterized distributions belonging to the NEF-QVF family. These are the (i) binomial, (ii) Poisson, (iii) normal with known variance, (iv) negative binomial, (v) gamma and (vi) generalized hyperbolic secant. For the binomial distribution, $v_0 = 0$, $v_1 = 1$ and $v_2 = -1$. For the Poisson distribution, $v_0 = v_2 = 0$ and $v_1 = 1$. For the normal distribution with known variance $\sigma^2 = 1$, $v_0 = 1$ and $v_1 = v_2 = 0$.

We assume that the survey weights $w_{ij}$ are independent of the $y_{ij}$ so that they are fixed numbers given the sample. Our objective is to estimate the weighted small domain means $\mu_{iw} = \sum_{j=1}^{n_i} w_{ij}\mu_{ij}$, $i = 1, \ldots, k$. The direct unbiased estimator of $\mu_{iw}$ is given by $\sum_{j=1}^{n_i} w_{ij}y_{ij}$. However, as noted earlier, for many of these domains, the sample sizes are so small that these unbiased estimators are subject to large standard errors and coefficients of variation.

We  propose  instead  EB  estimators  of  the  small  domain  means.  To  this 
end,  we  begin  with  the  general  NEF-QVF  family  of  distributions  along  with  a 
conjugate  prior  for  the  canonical  parameter  of  the  exponential  model.  Together 
they  constitute  an  overdispersed  NEF-QVF  family  of  distributions.  Specifically,  we 
consider  the  conjugate  prior  with  pdf 


$$\pi(\theta_{ij}) \propto \exp[\lambda\{m_{ij}\theta_{ij} - \psi(\theta_{ij})\}],$$

where $m_{ij} = g(x_{ij}^T b)$, $j = 1, \ldots, n_i$; $i = 1, \ldots, k$. Here $x_{ij}$ is the design vector associated with the $j$th unit in the $i$th small domain, and $g$ is the link function. Then (Morris, 1983),

$$E(\mu_{ij}) = m_{ij}, \qquad V(\mu_{ij}) = Q(m_{ij})/(\lambda - v_2),$$

where we recall the definition of $Q$ as given earlier, and assume that $\lambda > \max(0, v_2)$. Since $V(\mu_{ij})$ is strictly decreasing in $\lambda$, we may interpret the latter as the precision parameter. We will also see later that $\lambda$ acts as the tuning parameter for the EB estimators.

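We first obtain the Bayes estimator of $\mu_{iw}$. This is given by (Morris, 1983)

$$\hat\mu_{iw}^{B} = \sum_{j=1}^{n_i} w_{ij}\left[\frac{1}{\lambda+1}\,y_{ij} + \frac{\lambda}{\lambda+1}\,m_{ij}\right].$$

The above can also be viewed as the best linear unbiased predictor (BLUP) of $\mu_{iw}$. To see this, we calculate

$$E(y_{ij}) = E(\mu_{ij}) = m_{ij}, \qquad \mathrm{Cov}(y_{ij}, \mu_{ij}) = V(\mu_{ij}) = \frac{Q(m_{ij})}{\lambda - v_2}, \qquad V(y_{ij}) = \frac{(\lambda+1)\,Q(m_{ij})}{\lambda - v_2}.$$

Hence, the BLUP of $\mu_{ij}$ is $m_{ij} + \frac{1}{\lambda+1}(y_{ij} - m_{ij}) = E(\mu_{ij} \mid y_{ij})$.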
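As a small numerical illustration of the shrinkage form of the Bayes estimator, the sketch below combines unit-level responses with prior means $m_{ij}$ for one domain. The function names and the toy inputs are hypothetical; $b$ and $\lambda$ are treated as known here, which is the Bayes (not yet empirical Bayes) setting.

```python
def bayes_unit_estimate(y, m, lam):
    """Posterior mean of mu_ij: weight 1/(lam+1) on y_ij, lam/(lam+1) on m_ij."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    return (y + lam * m) / (1.0 + lam)

def bayes_domain_estimate(ys, ms, ws, lam):
    """Weighted domain-level Bayes estimate sum_j w_ij * E(mu_ij | y_ij)."""
    return sum(w * bayes_unit_estimate(y, m, lam) for y, m, w in zip(ys, ms, ws))

# Toy domain: four units, equal normalized weights.
ys = [0, 0, 1, 0]                  # responses y_ij
ms = [0.10, 0.15, 0.20, 0.15]      # prior means m_ij (assumed known)
ws = [0.25, 0.25, 0.25, 0.25]      # weights summing to one

est_half = bayes_domain_estimate(ys, ms, ws, lam=0.5)
est_one = bayes_domain_estimate(ys, ms, ws, lam=1.0)
print(est_half, est_one)
```

In this toy domain the direct estimate is .25 and the weighted prior mean is .15, so a larger $\lambda$ pulls the estimate further toward .15, which is the sense in which $\lambda$ acts as a tuning parameter.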

In practice, however, $b$ and $\lambda$ are unknown, and need to be estimated from the marginals of the $y_{ij}$. However, except for the normal distribution, these marginals are fairly complicated, and finding MLE's from the marginal likelihoods can become quite formidable. Instead, we find estimates based on some optimal unbiased estimating equations (Godambe and Thompson, 1989) which require only evaluation of the first four moments of these marginals.

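To this end, we begin with the elementary unbiased estimating functions $g_{1ij} = y_{ij} - m_{ij}$ and $g_{2ij} = (y_{ij} - m_{ij})^2 - V(y_{ij})$, where $V(y_{ij}) = (\lambda+1)Q(m_{ij})/(\lambda - v_2)$ is the marginal variance of $y_{ij}$. In order to construct the optimal estimating equations, let

$$D_{ij}^T = -E\begin{pmatrix} \partial g_{1ij}/\partial b & \partial g_{2ij}/\partial b \\ \partial g_{1ij}/\partial\lambda & \partial g_{2ij}/\partial\lambda \end{pmatrix}.$$

Also, let

$$\Sigma_{ij} = \begin{pmatrix} \mu_{2ij} & \mu_{3ij} \\ \mu_{3ij} & \mu_{4ij} - \mu_{2ij}^2 \end{pmatrix},$$

where $\mu_{rij} = E(y_{ij} - m_{ij})^r$ is the $r$th central moment of $y_{ij}$ based on its marginal distribution. The optimal estimating equations are then given by $\sum_{i=1}^{k}\sum_{j=1}^{n_i} D_{ij}^T\Sigma_{ij}^{-1}g_{ij} = 0$, where $g_{ij} = (g_{1ij}, g_{2ij})^T$. We obtain estimates of $b$ and $\lambda$ (if they exist) by solving these equations. The solutions of these equations are found by the Nelder-Mead algorithm.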




Unfortunately, the above method fails for binary data. In this case, $v_2 = -1$ so that $\mathrm{var}(y_{ij})$ does not depend on $\lambda$. Indeed, the marginal beta-binary distributions of the $y_{ij}$ are unidentifiable in $\lambda$. A simple way to verify this is that if $y \mid p \sim \mathrm{Bin}(1, p)$, and $p \sim \mathrm{Beta}(\lambda m, \lambda(1 - m))$, then $E(y) = E(p) = m$, and a binary distribution is completely characterized by its mean. The problem does not occur for a Binomial$(n, p)$ distribution, with $n \geq 2$, since with the same marginal for $p$, the mgf of the marginal distribution of the binomial $y$ is $E[(p\exp(t) + 1 - p)^n]$ which depends on $\lambda$.
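The identifiability point can be checked numerically. The sketch below evaluates the beta-binomial marginal pmf obtained by mixing a Binomial$(n, p)$ over $p \sim \mathrm{Beta}(\lambda m, \lambda(1-m))$; the function names and the particular values of $m$ and $\lambda$ are illustrative assumptions only.

```python
from math import comb, exp, lgamma

def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def beta_binomial_pmf(y, n, m, lam):
    """P(Y = y) when Y | p ~ Bin(n, p) and p ~ Beta(lam*m, lam*(1-m))."""
    a, b = lam * m, lam * (1.0 - m)
    return comb(n, y) * exp(log_beta(a + y, b + n - y) - log_beta(a, b))

m = 0.2
# Binary case (n = 1): the marginal is Bernoulli(m) whatever lam is.
binary_small_lam = beta_binomial_pmf(1, 1, m, lam=0.5)
binary_large_lam = beta_binomial_pmf(1, 1, m, lam=5.0)
# n = 2: P(Y = 2) = m * (lam*m + 1) / (lam + 1), which changes with lam.
pair_small_lam = beta_binomial_pmf(2, 2, m, lam=0.5)
pair_large_lam = beta_binomial_pmf(2, 2, m, lam=5.0)
print(binary_small_lam, binary_large_lam, pair_small_lam, pair_large_lam)
```

With $n = 1$ the two values of $\lambda$ give the same marginal, so no function of the binary data can distinguish them; with $n = 2$ the marginals differ.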

For binary $y_{ij}$, $\partial g_{2ij}/\partial\lambda = 0$ so that the second element of the vector $D_{ij}^T\Sigma_{ij}^{-1}g_{ij}$ is zero. Accordingly, the proposed estimating equations approach fails to estimate $\lambda$. The basic data, to be considered in the next two sections, is binary, and this necessitates modification of the proposed procedure.

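We have thus considered the optimal estimating function for $b$ alone,

$$\sum_{i=1}^{k}\sum_{j=1}^{n_i} \frac{\partial m_{ij}}{\partial b}\,\frac{y_{ij} - m_{ij}}{V(y_{ij})},$$

and obtain different estimators for different choices of $\lambda$. It may be noted also that in this case $V(y_{ij}) = Q(m_{ij}) = m_{ij}(1 - m_{ij})$. Further, with the logistic representation $m_{ij}(b) = \exp(x_{ij}^T b)/[1 + \exp(x_{ij}^T b)]$, $\partial m_{ij}/\partial b = m_{ij}(1 - m_{ij})x_{ij}$. Thus $b$ is estimated from the estimating equations $\sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}y_{ij} = \sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}m_{ij}$. Denoting the resulting estimator by $\hat b$, an EB estimator of $p_{ij}$ is given by

$$\hat p_{ij}^{EB} = \frac{1}{\lambda+1}\,y_{ij} + \frac{\lambda}{\lambda+1}\,m_{ij}(\hat b).$$

Accordingly, the EB estimator of $p_{iw} = \sum_{j=1}^{n_i} w_{ij}p_{ij}$ is $\hat p_{iw}^{EB} = \sum_{j=1}^{n_i} w_{ij}\hat p_{ij}^{EB}$.

Next, in this section, we find the mean squared errors (MSE) and also the estimated MSE's of $\hat p_{iw}^{EB}$.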
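The estimating equations $\sum\sum x_{ij}y_{ij} = \sum\sum x_{ij}m_{ij}(b)$ for the logistic case, and the EB estimator built from their solution, can be sketched as follows. This is a minimal illustration under stated assumptions: Newton-Raphson is used here as one convenient solver (the text does not say which solver was used for this step), and the helper names `solve`, `fit_b`, `eb_domain_estimate` and the toy data are hypothetical.

```python
import math

def expit(t):
    return 1.0 / (1.0 + math.exp(-t))

def solve(A, v):
    """Solve A z = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    M = [row[:] + [v[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for r in reversed(range(n)):
        z[r] = (M[r][n] - sum(M[r][k] * z[k] for k in range(r + 1, n))) / M[r][r]
    return z

def fitted_means(X, b):
    return [expit(sum(xk * bk for xk, bk in zip(x, b))) for x in X]

def fit_b(X, y, tol=1e-10, max_iter=50):
    """Solve sum x_ij (y_ij - m_ij(b)) = 0 by Newton-Raphson."""
    p = len(X[0])
    b = [0.0] * p
    for _ in range(max_iter):
        m = fitted_means(X, b)
        score = [sum(x[r] * (yi - mi) for x, yi, mi in zip(X, y, m)) for r in range(p)]
        info = [[sum(x[r] * x[s] * mi * (1 - mi) for x, mi in zip(X, m))
                 for s in range(p)] for r in range(p)]
        step = solve(info, score)
        b = [bk + sk for bk, sk in zip(b, step)]
        if max(abs(s) for s in step) < tol:
            break
    return b

def eb_domain_estimate(y, m_hat, w, lam):
    """sum_j w_ij * (y_ij + lam * m_ij(b_hat)) / (1 + lam)."""
    return sum(wi * (yi + lam * mi) / (1.0 + lam) for yi, mi, wi in zip(y, m_hat, w))

# Toy data: intercept plus one binary covariate, pooled over all domains.
X = [[1, 0], [1, 0], [1, 0], [1, 0], [1, 1], [1, 1], [1, 1], [1, 1]]
y = [0, 0, 0, 1, 0, 1, 1, 1]
b_hat = fit_b(X, y)
m_hat = fitted_means(X, b_hat)

# One hypothetical domain made of units 0, 1, 4, 5 with equal weights.
idx = [0, 1, 4, 5]
eb = eb_domain_estimate([y[i] for i in idx], [m_hat[i] for i in idx], [0.25] * 4, lam=1.0)
print(b_hat, eb)
```

Note that $\hat b$ does not depend on $\lambda$; only the final shrinkage step does, which is why different choices of $\lambda$ give different EB estimators from the same fit.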


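Theorem 4.3.1 An approximate expression for $MSE(\hat p_{iw}^{EB})$ which is correct up to $O(n_T^{-1})$ is given by

$$\frac{\lambda}{(\lambda+1)^2}\sum_{j=1}^{n_i} w_{ij}^2\, m_{ij}(b)(1 - m_{ij}(b)) + \frac{\lambda^2}{(\lambda+1)^2}\left[\sum_{j=1}^{n_i} w_{ij}m_{ij}(b)(1 - m_{ij}(b))x_{ij}\right]^T \Sigma^{-1}(b) \left[\sum_{j=1}^{n_i} w_{ij}m_{ij}(b)(1 - m_{ij}(b))x_{ij}\right].$$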


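Proof:

$$\begin{aligned}
MSE(\hat p_{iw}^{EB}) &= E(\hat p_{iw}^{EB} - p_{iw})^2 = E\left[\sum_{j=1}^{n_i} w_{ij}(\hat p_{ij}^{EB} - p_{ij})\right]^2 \\
&= E\left[\sum_{j=1}^{n_i} w_{ij}(\hat p_{ij}^{EB} - \hat p_{ij}^{B} + \hat p_{ij}^{B} - p_{ij})\right]^2 \\
&= E\left[\sum_{j=1}^{n_i} w_{ij}(\hat p_{ij}^{EB} - \hat p_{ij}^{B})\right]^2 + E\left[\sum_{j=1}^{n_i} w_{ij}(\hat p_{ij}^{B} - p_{ij})\right]^2 \\
&\quad + 2E\left\{\left[\sum_{j=1}^{n_i} w_{ij}(\hat p_{ij}^{EB} - \hat p_{ij}^{B})\right]\left[\sum_{j=1}^{n_i} w_{ij}(\hat p_{ij}^{B} - p_{ij})\right]\right\}.
\end{aligned}$$

Noting that $E(p_{ij} \mid \text{data}) = \hat p_{ij}^{B}$,

$$E\left\{\left[\sum_{j=1}^{n_i} w_{ij}(\hat p_{ij}^{EB} - \hat p_{ij}^{B})\right]\left[\sum_{j=1}^{n_i} w_{ij}(\hat p_{ij}^{B} - p_{ij})\right]\right\} = E\left\{\left[\sum_{j=1}^{n_i} w_{ij}(\hat p_{ij}^{EB} - \hat p_{ij}^{B})\right] E\left[\sum_{j=1}^{n_i} w_{ij}(\hat p_{ij}^{B} - p_{ij}) \,\Big|\, \text{data}\right]\right\} = 0.$$

Hence,

$$MSE(\hat p_{iw}^{EB}) = E\left[\sum_{j=1}^{n_i} w_{ij}(\hat p_{ij}^{EB} - \hat p_{ij}^{B})\right]^2 + E\left[\sum_{j=1}^{n_i} w_{ij}(\hat p_{ij}^{B} - p_{ij})\right]^2.$$

But

$$E\left[\sum_{j=1}^{n_i} w_{ij}(\hat p_{ij}^{B} - p_{ij})\right]^2 = \sum_{j=1}^{n_i} w_{ij}^2\, E(\hat p_{ij}^{B} - p_{ij})^2. \tag{4.1}$$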


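Next we calculate

$$\begin{aligned}
E(\hat p_{ij}^{B} - p_{ij})^2 &= E\left[\frac{1}{\lambda+1}(y_{ij} - p_{ij}) + \frac{\lambda}{\lambda+1}(m_{ij}(b) - p_{ij})\right]^2 \\
&= \frac{1}{(\lambda+1)^2}E(y_{ij} - p_{ij})^2 + \frac{\lambda^2}{(\lambda+1)^2}E(m_{ij}(b) - p_{ij})^2 + \frac{2\lambda}{(\lambda+1)^2}E(y_{ij} - p_{ij})(m_{ij}(b) - p_{ij}) \\
&= \frac{1}{(\lambda+1)^2}E(p_{ij}(1 - p_{ij})) + \frac{\lambda^2}{(\lambda+1)^2}\cdot\frac{m_{ij}(b)(1 - m_{ij}(b))}{\lambda+1} + 0 \\
&= \frac{1}{(\lambda+1)^2}\cdot\frac{\lambda}{\lambda+1}\,m_{ij}(b)(1 - m_{ij}(b)) + \frac{\lambda^2}{(\lambda+1)^2}\cdot\frac{m_{ij}(b)(1 - m_{ij}(b))}{\lambda+1} \\
&= \frac{\lambda}{(\lambda+1)^2}\,m_{ij}(b)(1 - m_{ij}(b)),
\end{aligned}$$

so that

$$E\left[\sum_{j=1}^{n_i} w_{ij}(\hat p_{ij}^{B} - p_{ij})\right]^2 = \frac{\lambda}{(\lambda+1)^2}\sum_{j=1}^{n_i} w_{ij}^2\, m_{ij}(b)(1 - m_{ij}(b)). \tag{4.2}$$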

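Finally, we calculate

$$\begin{aligned}
E\left[\sum_{j=1}^{n_i} w_{ij}(\hat p_{ij}^{EB} - \hat p_{ij}^{B})\right]^2 &= \frac{\lambda^2}{(\lambda+1)^2}\,E\left[\sum_{j=1}^{n_i} w_{ij}\left(m_{ij}(\hat b) - m_{ij}(b)\right)\right]^2 \\
&= \frac{\lambda^2}{(\lambda+1)^2}\Bigg\{\sum_{j=1}^{n_i} w_{ij}^2\, E\left[m_{ij}(\hat b) - m_{ij}(b)\right]^2 \\
&\qquad + \mathop{\sum\sum}_{1 \le j \ne j' \le n_i} w_{ij}w_{ij'}\, E\left[\left(m_{ij}(\hat b) - m_{ij}(b)\right)\left(m_{ij'}(\hat b) - m_{ij'}(b)\right)\right]\Bigg\}.
\end{aligned} \tag{4.3}$$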


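By two-step Taylor expansion,

$$m_{ij}(\hat b) \approx m_{ij}(b) + \left(\frac{\partial m_{ij}(b)}{\partial b}\right)^T(\hat b - b) + \frac{1}{2}(\hat b - b)^T\frac{\partial^2 m_{ij}(b)}{\partial b\,\partial b^T}(\hat b - b).$$

Noting that $\partial m_{ij}(b)/\partial b = m_{ij}(b)(1 - m_{ij}(b))x_{ij}$ and $\partial^2 m_{ij}(b)/\partial b\,\partial b^T = (1 - 2m_{ij}(b))\,m_{ij}(b)(1 - m_{ij}(b))\,x_{ij}x_{ij}^T$, it follows that

$$\begin{aligned}
E\left[m_{ij}(\hat b) - m_{ij}(b)\right]^2 &= E\Big[m_{ij}(b)(1 - m_{ij}(b))\,x_{ij}^T(\hat b - b) \\
&\qquad + \tfrac{1}{2}(\hat b - b)^T m_{ij}(b)(1 - m_{ij}(b))(1 - 2m_{ij}(b))\,x_{ij}x_{ij}^T(\hat b - b)\Big]^2 \\
&= m_{ij}^2(b)(1 - m_{ij}(b))^2\, E\Big[x_{ij}^T(\hat b - b) + \tfrac{1}{2}(1 - 2m_{ij}(b))(\hat b - b)^T x_{ij}x_{ij}^T(\hat b - b)\Big]^2.
\end{aligned} \tag{4.4}$$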


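The first neglected term is $O_p(\|\hat b - b\|^3)$. From Sarkar and Ghosh (1998), $\hat b - b$ is asymptotically $N(0, \Sigma^{-1}(b))$, where $\Sigma(b) = X^T M(I - M)X$, $X^T = (x_{11}, \ldots, x_{1n_1}, \ldots, x_{k1}, \ldots, x_{kn_k})$ and $M = \mathrm{Diag}(m_{11}, \ldots, m_{1n_1}, \ldots, m_{k1}, \ldots, m_{kn_k})$. With the customary assumption $\frac{1}{n_T}\Sigma(b) = O(1)$, it follows that $\Sigma^{-1}(b) = O(n_T^{-1})$. Thus, $\|\hat b - b\| = O_p(n_T^{-1/2})$. Hence, the first neglected term is $O_p(n_T^{-3/2})$. Next, we observe that

$$E\left[x_{ij}^T(\hat b - b)\right]^2 = E\left[(\hat b - b)^T x_{ij}x_{ij}^T(\hat b - b)\right] = \mathrm{tr}\left[x_{ij}x_{ij}^T\, E(\hat b - b)(\hat b - b)^T\right]. \tag{4.5}$$

In order to find $E\left[(\hat b - b)(\hat b - b)^T\right]$ we proceed as follows:

Let $T(b) = \sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}(y_{ij} - m_{ij}(b))$, so that $T(\hat b) = 0$.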


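By one-step Taylor expansion, $0 = T(\hat b) = T(b) + [\nabla T(b)]^T(\hat b - b) + O_p(1)$, where

$$\begin{aligned}
\nabla T(b) &= -\sum_{i=1}^{k}\sum_{j=1}^{n_i} \frac{\partial m_{ij}(b)}{\partial b}\,x_{ij}^T \\
&= -\sum_{i=1}^{k}\sum_{j=1}^{n_i} m_{ij}(b)(1 - m_{ij}(b))\,x_{ij}x_{ij}^T \\
&= -X^T M(I - M)X \\
&= -\Sigma(b).
\end{aligned} \tag{4.6}$$

Thus, $\hat b - b = \Sigma^{-1}(b)T(b) + O_p(n_T^{-1})$. Since $V(y_{ij}) = m_{ij}(b)(1 - m_{ij}(b))$, $V(T(b)) = \Sigma(b)$. Hence $E\left[(\hat b - b)(\hat b - b)^T\right] = \Sigma^{-1}(b) + O(n_T^{-3/2})$. Going back to (4.5), we have $E\left[x_{ij}^T(\hat b - b)\right]^2 = \mathrm{tr}\left[x_{ij}x_{ij}^T\Sigma^{-1}(b)\right] + O(n_T^{-3/2})$. Accordingly, by (4.4) and (4.5), we have the approximation

$$E\left[m_{ij}(\hat b) - m_{ij}(b)\right]^2 = m_{ij}^2(b)(1 - m_{ij}(b))^2\, x_{ij}^T\Sigma^{-1}(b)x_{ij} \tag{4.7}$$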


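which is correct up to $O(n_T^{-1})$. Note that the neglected term

$$E\left[x_{ij}^T(\hat b - b)(1 - 2m_{ij}(b))(\hat b - b)^T x_{ij}x_{ij}^T(\hat b - b)\right] = O(n_T^{-3/2}),$$

since

$$\begin{aligned}
E\left[x_{ij}^T(\hat b - b)(1 - 2m_{ij}(b))(\hat b - b)^T x_{ij}x_{ij}^T(\hat b - b)\right] &= (1 - 2m_{ij}(b))\,E\left[x_{ij}^T(\hat b - b)(\hat b - b)^T x_{ij}\,x_{ij}^T(\hat b - b)\right] \\
&= (1 - 2m_{ij}(b))\,E\left[x_{ij}^T(\hat b - b)\right]^3 \\
&= O(n_T^{-3/2}).
\end{aligned}$$

Similarly, note for the other neglected term that $E\left[x_{ij}^T(\hat b - b)\right]^4 = O(n_T^{-2})$.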


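Similarly, we find

$$E\left[\left(m_{ij}(\hat b) - m_{ij}(b)\right)\left(m_{ij'}(\hat b) - m_{ij'}(b)\right)\right] = m_{ij}(b)(1 - m_{ij}(b))\,m_{ij'}(b)(1 - m_{ij'}(b))\,x_{ij}^T\Sigma^{-1}(b)x_{ij'} + O(n_T^{-3/2}). \tag{4.8}$$

This leads to

$$\begin{aligned}
E\left[\sum_{j=1}^{n_i} w_{ij}(\hat p_{ij}^{EB} - \hat p_{ij}^{B})\right]^2 &= \frac{\lambda^2}{(\lambda+1)^2}\Bigg[\sum_{j=1}^{n_i} w_{ij}^2\, m_{ij}^2(b)(1 - m_{ij}(b))^2\, x_{ij}^T\Sigma^{-1}(b)x_{ij} \\
&\qquad + \mathop{\sum\sum}_{1 \le j \ne j' \le n_i} w_{ij}w_{ij'}\, m_{ij}(b)(1 - m_{ij}(b))\,m_{ij'}(b)(1 - m_{ij'}(b))\,x_{ij}^T\Sigma^{-1}(b)x_{ij'}\Bigg] + O(n_T^{-3/2}) \\
&= \frac{\lambda^2}{(\lambda+1)^2}\left[\sum_{j=1}^{n_i} w_{ij}m_{ij}(b)(1 - m_{ij}(b))x_{ij}\right]^T \Sigma^{-1}(b) \left[\sum_{j=1}^{n_i} w_{ij}m_{ij}(b)(1 - m_{ij}(b))x_{ij}\right] + O(n_T^{-3/2}).
\end{aligned} \tag{4.9}$$

Since $\Sigma^{-1}(b) = O(n_T^{-1})$, the first term is $O(n_T^{-1})$. The theorem follows now from (4.2) and (4.9).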
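To make the two pieces of the approximation concrete, the sketch below evaluates the sum of the right-hand sides of (4.2) and (4.9) for one domain, with $\Sigma(b) = \sum_i\sum_j m_{ij}(1 - m_{ij})x_{ij}x_{ij}^T$ accumulated over all sampled units. It treats $b$ (hence the $m_{ij}$) as known; the function names and the intercept-only toy data are hypothetical and chosen so the result can be checked by hand.

```python
def solve(A, v):
    """Solve A z = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    M = [row[:] + [v[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for r in reversed(range(n)):
        z[r] = (M[r][n] - sum(M[r][k] * z[k] for k in range(r + 1, n))) / M[r][r]
    return z

def mse_approx(X_all, m_all, X_dom, m_dom, w_dom, lam):
    """Sum of the right-hand sides of (4.2) and (4.9) for one domain."""
    p = len(X_all[0])
    # Sigma(b) = X^T M (I - M) X over every sampled unit in every domain.
    sigma = [[sum(m * (1 - m) * x[r] * x[s] for x, m in zip(X_all, m_all))
              for s in range(p)] for r in range(p)]
    # a = sum_j w_ij m_ij (1 - m_ij) x_ij for the domain of interest.
    a = [sum(w * m * (1 - m) * x[r] for x, m, w in zip(X_dom, m_dom, w_dom))
         for r in range(p)]
    quad = sum(ar * zr for ar, zr in zip(a, solve(sigma, a)))
    first = lam / (1 + lam) ** 2 * sum(w * w * m * (1 - m) for m, w in zip(m_dom, w_dom))
    second = lam ** 2 / (1 + lam) ** 2 * quad
    return first + second

# Toy setting: intercept-only model, 40 units overall, m_ij = 0.2 everywhere,
# and one domain with 4 equally weighted units.
X_all = [[1.0]] * 40
m_all = [0.2] * 40
mse = mse_approx(X_all, m_all, [[1.0]] * 4, [0.2] * 4, [0.25] * 4, lam=1.0)
print(mse)
```

Here the first term equals .010 and the second .001, which illustrates that the contribution from estimating $b$ shrinks as the total sample size grows while the first term depends only on the domain's own sample.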

We now turn to estimation of the MSE which is correct up to $O(n_T^{-1})$. The following theorem is proved.


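Theorem 4.3.2 The following approximation to the MSE holds correct up to $O(n_T^{-1})$:

$$\begin{aligned}
&\frac{\lambda}{(1+\lambda)^2}\sum_{j=1}^{n_i} w_{ij}^2\Bigg[m_{ij}(\hat b)(1 - m_{ij}(\hat b)) \\
&\qquad - (1 - 2m_{ij}(\hat b))\,m_{ij}(\hat b)(1 - m_{ij}(\hat b))\Bigg\{\frac{1}{2}\,x_{ij}^T\Sigma^{-1}(\hat b)\begin{pmatrix}\mathrm{tr}[\Sigma^{-1}(\hat b)K_1(\hat b)] \\ \vdots \\ \mathrm{tr}[\Sigma^{-1}(\hat b)K_p(\hat b)]\end{pmatrix} + \frac{1}{2}(1 - 2m_{ij}(\hat b))\,x_{ij}^T\Sigma^{-1}(\hat b)x_{ij}\Bigg\} \\
&\qquad + m_{ij}^2(\hat b)(1 - m_{ij}(\hat b))^2\,x_{ij}^T\Sigma^{-1}(\hat b)x_{ij}\Bigg] \\
&+ \frac{\lambda^2}{(\lambda+1)^2}\left[\sum_{j=1}^{n_i} w_{ij}m_{ij}(\hat b)(1 - m_{ij}(\hat b))x_{ij}\right]^T \Sigma^{-1}(\hat b) \left[\sum_{j=1}^{n_i} w_{ij}m_{ij}(\hat b)(1 - m_{ij}(\hat b))x_{ij}\right].
\end{aligned}$$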


Proof: We first note that $\hat b = b + O_p(n_T^{-1/2})$ and $\Sigma^{-1}(b) = O(n_T^{-1})$. Hence, the second term in the MSE approximation of Theorem 4.3.1 is approximated by

$$c\left[\sum_{j=1}^{n_i} w_{ij}m_{ij}(\hat b)(1 - m_{ij}(\hat b))x_{ij}\right]^T \Sigma^{-1}(\hat b) \left[\sum_{j=1}^{n_i} w_{ij}m_{ij}(\hat b)(1 - m_{ij}(\hat b))x_{ij}\right] \tag{4.10}$$

($c = \lambda^2/(1+\lambda)^2$), which is correct up to $O(n_T^{-1})$.

However, if we estimate $m_{ij}(b)(1 - m_{ij}(b))$ simply by $m_{ij}(\hat b)(1 - m_{ij}(\hat b))$, we will be ignoring the $O(n_T^{-1})$ term. Thus, we need a careful approximation of the bias $E(\hat b - b)$ to achieve the desired approximation. To this end, we follow Cox and Snell (1968).


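We begin with the identity

$$\begin{aligned}
E\left[m_{ij}(\hat b)(1 - m_{ij}(\hat b))\right] &= E\left[\left(m_{ij}(b) + m_{ij}(\hat b) - m_{ij}(b)\right)\left(1 - m_{ij}(b) - (m_{ij}(\hat b) - m_{ij}(b))\right)\right] \\
&= m_{ij}(b)(1 - m_{ij}(b)) + (1 - 2m_{ij}(b))\,E\left[m_{ij}(\hat b) - m_{ij}(b)\right] - E\left[m_{ij}(\hat b) - m_{ij}(b)\right]^2.
\end{aligned}$$

Now, again by a two-step Taylor expansion,

$$E\left[m_{ij}(\hat b) - m_{ij}(b)\right] = \left(\frac{\partial m_{ij}(b)}{\partial b}\right)^T E(\hat b - b) + \frac{1}{2}E\left[(\hat b - b)^T\frac{\partial^2 m_{ij}(b)}{\partial b\,\partial b^T}(\hat b - b)\right] + O(n_T^{-3/2}).$$

In order to find $E(\hat b - b)$, we proceed as follows. We begin with the second order Taylor expansion

$$0 = T_r(\hat b) = T_r(b) + \sum_{s=1}^{p}(\hat b_s - b_s)\frac{\partial T_r(b)}{\partial b_s} + \frac{1}{2}\sum_{s=1}^{p}\sum_{t=1}^{p}(\hat b_s - b_s)(\hat b_t - b_t)\frac{\partial^2 T_r(b)}{\partial b_s\,\partial b_t} + O_p(n_T^{-3/2}).$$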

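Taking expectations and following Cox and Snell (1968), and writing $\sigma_{rs}$ and $\sigma^{rs}$ for the $(r, s)$th elements of $\Sigma(b)$ and $\Sigma^{-1}(b)$ respectively,

$$\begin{aligned}
0 = E(T_r(\hat b)) &= \sum_{s=1}^{p}\left[E(\hat b_s - b_s)\,E\left(\frac{\partial T_r(b)}{\partial b_s}\right) + \mathrm{Cov}\left(\hat b_s - b_s, \frac{\partial T_r(b)}{\partial b_s}\right)\right] \\
&\quad + \frac{1}{2}\sum_{s=1}^{p}\sum_{t=1}^{p}\left[E\{(\hat b_s - b_s)(\hat b_t - b_t)\}\,E\left(\frac{\partial^2 T_r(b)}{\partial b_s\,\partial b_t}\right) + \mathrm{Cov}\left((\hat b_s - b_s)(\hat b_t - b_t), \frac{\partial^2 T_r(b)}{\partial b_s\,\partial b_t}\right)\right] + O(n_T^{-3/2}) \\
&= -\sum_{s=1}^{p} E(\hat b_s - b_s)\,\sigma_{rs} + \sum_{s=1}^{p}\sum_{u=1}^{p}\mathrm{Cov}\left(\sigma^{su}(b)T_u(b), \frac{\partial T_r(b)}{\partial b_s}\right) \\
&\quad + \frac{1}{2}\sum_{s=1}^{p}\sum_{t=1}^{p}\sigma^{st}\,E\left(\frac{\partial^2 T_r(b)}{\partial b_s\,\partial b_t}\right) + O(n_T^{-3/2}).
\end{aligned} \tag{4.11}$$

Note $\mathrm{Cov}\left(\sigma^{su}(b)T_u(b), \partial T_r(b)/\partial b_s\right) = 0$, since $\partial T_r(b)/\partial b_s$ is a constant independent of the $y_{ij}$.

Similarly, $\mathrm{Cov}\left((\hat b_s - b_s)(\hat b_t - b_t), \partial^2 T_r(b)/\partial b_s\,\partial b_t\right) = 0$.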


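Also, let

$$\begin{aligned}
K_{rst} = E\left[\frac{\partial^2 T_r(b)}{\partial b_s\,\partial b_t}\right] &= -\sum_{i=1}^{k}\sum_{j=1}^{n_i}\frac{\partial}{\partial b_t}\left[m_{ij}(b)(1 - m_{ij}(b))\right]x_{ijr}x_{ijs} \\
&= -\sum_{i=1}^{k}\sum_{j=1}^{n_i}(1 - 2m_{ij}(b))\,m_{ij}(b)(1 - m_{ij}(b))\,x_{ijr}x_{ijs}x_{ijt}.
\end{aligned}$$

Thus, one has

$$\sum_{s=1}^{p}\sigma_{rs}\,E(\hat b_s - b_s) = \frac{1}{2}\sum_{s=1}^{p}\sum_{t=1}^{p}\sigma^{st}K_{rst}, \qquad r = 1, \ldots, p.$$

In matrix notations, one gets

$$\Sigma\,E(\hat b - b) = \frac{1}{2}\begin{pmatrix}\mathrm{tr}(\Sigma^{-1}K_1) \\ \vdots \\ \mathrm{tr}(\Sigma^{-1}K_p)\end{pmatrix},$$

where $K_r = ((K_{rst}))$. Hence,

$$E(\hat b - b) = \frac{1}{2}\,\Sigma^{-1}\begin{pmatrix}\mathrm{tr}(\Sigma^{-1}K_1) \\ \vdots \\ \mathrm{tr}(\Sigma^{-1}K_p)\end{pmatrix}. \tag{4.12}$$

The theorem follows.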

4.4  Data  analysis 

We use data provided by the National Center for Health Statistics (NCHS) to estimate the proportion of uninsured persons in a minority subpopulation.




The data are given for the years 1997, 1998, 1999 and 2000. The original survey for any given year contains data on more than 100,000 individuals and on over 800 variables. For these individuals, we have information on the primary response variable, namely whether a person has health insurance or not. In addition, there is information on demographic characteristics such as age, sex, race, region, education, income status, medical condition, disability conditions (if any) and many other socio-economic factors.

Appropriate selection of covariates is an important first step towards sensible data analysis and it is especially important here since there are so many variables to choose from. In the next subsection, we will discuss the process of selection of covariates. Once we have these (and the response), we employ both EB and HB methods discussed in the previous sections to obtain the estimates. Also, we provide the posterior standard deviations of the HB estimators and the asymptotic estimated mean squared errors of the EB estimators.

4.4.1  Selection  of  covariates 

In our analysis, we have constructed several domains, for each year, by cross-classifying with respect to age, sex, race and region.

The  first  concern  is  which  covariates  to  select  and  keep  for  eventual  analysis. 
Obviously,  the  inclusion  of  all  these  covariates  is  impractical  and  unnecessary.  We 
started  with  a set  of  6 covariates  (that  we  thought  were  the  most  relevant)  and  an 
initial  model  and  after  a process  of  forward  and  backward  selection  finally  chose  the 
best  model  with  minimum  number  of  covariates. 

Initially  we  started  with  the  following  covariates:  (1)  legal  marital  status, 

(2)  family  size,  (3)  education  level,  (4)  total  earning  from  previous  year,  (5)  total 
family  income,  and  (6)  full  time  working  status. 

Nearly  two-thirds  of  the  data  on  full  time  working  status  were  missing  for 
these  years.  Hence,  this  covariate  was  dropped  immediately  for  model  selection. 




Also,  legal  marital  status  and  total  earning  from  previous  year  were  both  found 
insignificant.  Thus,  along  with  the  intercept  term,  the  selected  covariates  are  family 
size,  education  level,  and  total  family  income. 

We use SAS Versions for the initial computations. Also, we use the SURVEYREG Procedure for model selection. This is a relatively new procedure that adjusts for stratified sampling.

4.4.2  Small  Domain  Estimates  for  Asians 

The Asian group is formally composed of the (1) Chinese, (2) Filipino, (3) Asian Indian, and (4) Islanders such as Koreans, Vietnamese, Japanese, Hawaiian, Samoan, Guamanian etc. These individuals are assigned to specific domains depending on their age, race, gender and the region they come from. There are 3 age-groups (0-17, 18-64 and 65+), 2 Genders, 4 Races and 4 Regions depending on the size of the Metropolitan Statistical Area ($\leq$ 499,999; 500,000-999,999; 1,000,000-2,499,999; $\geq$ 2,500,000).

We first describe how the small domains are constructed. Consider the 4-tuple $(k_1, k_2, k_3, k_4)$, where $k_1 = 1, 2, 3$ or $4$ according as the person is Chinese, Filipino, Asian Indian or Islanders. Next $k_2 = 1$ or $2$ according as the person is a male or a female. Then $k_3 = 1, 2$ or $3$ according as the person belongs to the age-group 0-17, 18-64 or 65+. Finally, $k_4 = 1, 2, 3$ or $4$ according as the person belongs to a Metropolitan Statistical Area (MSA) of size $\leq$ 499,999, 500,000-999,999, 1,000,000-2,499,999 or $\geq$ 2,500,000. A small domain is now numbered by the formula $24(k_1 - 1) + 12(k_2 - 1) + 4(k_3 - 1) + k_4$ corresponding to the 4-tuple $(k_1, k_2, k_3, k_4)$. For example, the small domain consisting of Filipino females belonging to the age-group 18-64 and a MSA of size 500,000-999,999 is numbered 42.
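The numbering rule is easy to check mechanically. The following sketch encodes the formula and its inverse; the function names are hypothetical and the code is only an illustration of the stated rule.

```python
def domain_number(k1, k2, k3, k4):
    """Domain index 24(k1-1) + 12(k2-1) + 4(k3-1) + k4."""
    if not (1 <= k1 <= 4 and 1 <= k2 <= 2 and 1 <= k3 <= 3 and 1 <= k4 <= 4):
        raise ValueError("code out of range")
    return 24 * (k1 - 1) + 12 * (k2 - 1) + 4 * (k3 - 1) + k4

def domain_tuple(d):
    """Inverse map: recover (k1, k2, k3, k4) from a domain number 1..96."""
    if not 1 <= d <= 96:
        raise ValueError("domain number must be between 1 and 96")
    q1, rest = divmod(d - 1, 24)   # subgroup block of 24 domains
    q2, rest = divmod(rest, 12)    # gender block of 12 domains
    q3, q4 = divmod(rest, 4)       # age-group block of 4 MSA sizes
    return q1 + 1, q2 + 1, q3 + 1, q4 + 1

# Filipino (2), female (2), age 18-64 (2), MSA size 500,000-999,999 (2).
example = domain_number(2, 2, 2, 2)
all_numbers = sorted(
    domain_number(a, g, t, s)
    for a in range(1, 5) for g in range(1, 3)
    for t in range(1, 4) for s in range(1, 5)
)
print(example, domain_tuple(58))
```

The inverse map returns (3, 1, 3, 2) for domain 58, i.e. male Asian Indians aged 65+ in MSAs of size 500,000-999,999.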

Thus, the total number of domains equals $3 \times 2 \times 4 \times 4 = 96$. When the individuals are distributed to their respective domains, it turns out that many of the domains contain only a few samples. Indeed, there are several domains with a sample of size 1, while domain 58 has sample size zero.

The basic data consist of

$y_{ij}$ = 1 or 0 if the $j$th individual in the $i$th small domain does not (does) have health insurance;

$W_{ij}$ = the sampling weight attached to the $j$th unit in the $i$th small domain;

$w_{ij} = W_{ij}/\sum_{j=1}^{n_i} W_{ij}$ so that $\sum_{j=1}^{n_i} w_{ij} = 1$ for all $i$;

$x_{ij1}$ = the family size of the $j$th unit in the $i$th small domain;

$x_{ij2}$ = the education level of the $j$th unit in the $i$th small domain;

$x_{ij3}$ = total family income of the $j$th unit in the $i$th small domain.

Let $p_{ij} = E(y_{ij} \mid \theta_{ij})$.

For the HB analysis, we model

$$\theta_{ij} = \mathrm{logit}(p_{ij}) = b_0 + b_1x_{ij1} + b_2x_{ij2} + b_3x_{ij3} + u_i, \qquad j = 1, \ldots, n_i,\; i = 1, \ldots, 96.$$

The direct domain estimates are given by $\sum_{j=1}^{n_i} w_{ij}y_{ij}$. The corresponding HB estimates are given by $\hat p_{iw}^{HB} = \sum_{j=1}^{n_i} w_{ij}E(p_{ij} \mid y)$. We use MCMC as described in the previous section to obtain these estimates. Our hyperprior considers: $c$ = .2, .02, .002; $d$ = .2, .02, .002. The results are very insensitive to the choice of the hyperpriors, and are reported only for $c = d = .02$. We have reported also the standard errors related to the HB method. In addition, we have EB estimators for different choices of the tuning parameter $\lambda$. The results are reported for $\lambda = .5$ and $\lambda = 1$.
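The weight normalization and the direct domain estimate can be sketched as follows. The raw weights and responses are made-up toy values, and the function names are hypothetical.

```python
def normalize_weights(raw_weights):
    """Scale the sampling weights of one domain so that they sum to one."""
    total = sum(raw_weights)
    if total <= 0:
        raise ValueError("a domain needs positive total weight")
    return [w / total for w in raw_weights]

def direct_estimate(y, raw_weights):
    """Direct estimate sum_j w_ij * y_ij with normalized weights w_ij."""
    w = normalize_weights(raw_weights)
    return sum(wi * yi for wi, yi in zip(w, y))

# Toy domain: three sampled persons, the first one uninsured (y = 1).
raw = [1200.0, 800.0, 2000.0]
y = [1, 0, 0]
w = normalize_weights(raw)
direct = direct_estimate(y, raw)
print(w, direct)
```

A domain with sample size zero has no weights to normalize, which is why such domains are excluded from the tables.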

The tables (4-4 to 4-15) provide small area estimates of uninsured Asian people for the different small domains in the years 1997, 1998, 1999 and 2000 respectively. Domain 58 is excluded for 1997 due to zero sample size. For the same reason, we exclude domains 21, 58 and 70 in the year 1998, domain 69 in 1999 and domain 2 in the year 2000. Domain 58 refers to male Asian Indians in the age group 65+ belonging to MSAs of size 500,000-999,999. The measures of precision (posterior s.d.'s) associated with the weighted HB estimates are denoted by se(HB) and are given by the formula $\sqrt{V(p_{iw} \mid y)}$. Also, we provide approximate MSE for the EB estimators. One of the advantages of the HB or EB estimates is that for domains with very small sample sizes, often the direct estimate of the proportion of uninsured is zero, whereas the former provide small but non-zero estimates. We note also that when $\lambda = .5$, i.e. when weights $1/(\lambda+1) = 2/3$ and $\lambda/(\lambda+1) = 1/3$ are attached to the direct and synthetic estimates respectively, the EB and HB estimates are very close.


Note: Variable Definition:

(i) Asian Subgroup
    1 Chinese
    2 Filipino
    3 Asian Indian
    4 Asian Pacific Islanders

(ii) Gender
    1 Male
    2 Female

(iii) Age Groups
    1 Under 17 years
    2 18-64 years
    3 Above 65 years

(iv) MSASIZE
    1 Under 499,999
    2 500,000-999,999
    3 1,000,000-2,499,999
    4 Above 2,500,000


Table 4-1: Definition of Domains for Asians

Domain  Asian Subgroup  Gender  Age Group  MSA Size
     1               1       1          1         1
     2               1       1          1         2
     3               1       1          1         3
     4               1       1          1         4
     5               1       1          2         1
     6               1       1          2         2
     7               1       1          2         3
     8               1       1          2         4
     9               1       1          3         1
    10               1       1          3         2
    11               1       1          3         3
    12               1       1          3         4
    13               1       2          1         1
    14               1       2          1         2
    15               1       2          1         3
    16               1       2          1         4
    17               1       2          2         1
    18               1       2          2         2
    19               1       2          2         3
    20               1       2          2         4
    21               1       2          3         1
    22               1       2          3         2
    23               1       2          3         3
    24               1       2          3         4
    25               2       1          1         1
    26               2       1          1         2
    27               2       1          1         3
    28               2       1          1         4
    29               2       1          2         1
    30               2       1          2         2
    31               2       1          2         3
    32               2       1          2         4
    33               2       1          3         1
    34               2       1          3         2
    35               2       1          3         3


Table 4-2: Definition of Domains for Asians

Domain  Asian Subgroup  Gender  Age Group  MSA Size
    36               2       1          3         4
    37               2       2          1         1
    38               2       2          1         2
    39               2       2          1         3
    40               2       2          1         4
    41               2       2          2         1
    42               2       2          2         2
    43               2       2          2         3
    44               2       2          2         4
    45               2       2          3         1
    46               2       2          3         2
    47               2       2          3         3
    48               2       2          3         4
    49               3       1          1         1
    50               3       1          1         2
    51               3       1          1         3
    52               3       1          1         4
    53               3       1          2         1
    54               3       1          2         2
    55               3       1          2         3
    56               3       1          2         4
    57               3       1          3         1
    58               3       1          3         2
    59               3       1          3         3
    60               3       1          3         4
    61               3       2          1         1
    62               3       2          1         2
    63               3       2          1         3
    64               3       2          1         4
    65               3       2          2         1
    66               3       2          2         2
    67               3       2          2         3
    68               3       2          2         4
    69               3       2          3         1
    70               3       2          3         2


Table 4-3: Definition of Domains for Asians

Domain  Asian Subgroup  Gender  Age Group  MSA Size
    71               3       2          3         3
    72               3       2          3         4
    73               4       1          1         1
    74               4       1          1         2
    75               4       1          1         3
    76               4       1          1         4
    77               4       1          2         1
    78               4       1          2         2
    79               4       1          2         3
    80               4       1          2         4
    81               4       1          3         1
    82               4       1          3         2
    83               4       1          3         3
    84               4       1          3         4
    85               4       2          1         1
    86               4       2          1         2
    87               4       2          1         3
    88               4       2          1         4
    89               4       2          2         1
    90               4       2          2         2
    91               4       2          2         3
    92               4       2          2         4
    93               4       2          3         1
    94               4       2          3         2
    95               4       2          3         3
    96               4       2          3         4


Table 4-4: Small Area Estimates of the Proportions of Uninsured Asian: year 1997

Domain  n_i  Direct      HB  se(HB)      EB      EB  se(EB)  se(EB)
                                       λ=.5     λ=1    λ=.5     λ=1
     1   11    .103    .121    .045    .120    .128    .054    .057
     2    8    .088    .092    .044    .094    .096    .052    .055
     3   36    .000    .047    .033    .044    .066    .030    .030
     4   23    .194    .178    .037    .172    .161    .040    .038
     5   29    .382    .319    .051    .339    .318    .035    .037
     6   18    .086    .109    .037    .121    .139    .041    .043
     7   96    .156    .159    .017    .164    .167    .019    .020
     8   58    .323    .272    .037    .280    .258    .024    .025
     9    4    .000    .049    .067    .056    .084    .084    .089
    10    1    .000    .054    .133    .057    .086    .170    .179
    11   14    .000    .059    .050    .073    .109    .050    .054
    12    6    .000    .063    .066    .089    .134    .075    .079
    13   10    .038    .081    .051    .080    .101    .060    .063
    14    5    .163    .148    .071    .156    .152    .073    .077
    15   38    .179    .168    .028    .173    .169    .028    .030
    16   21    .295    .242    .048    .242    .215    .036    .038
    17   31    .263    .239    .036    .255    .252    .034    .037
    18   20    .164    .163    .036    .171    .175    .040    .042
    19  103    .131    .142    .017    .147    .156    .018    .019
    20   66    .303    .261    .034    .264    .245    .023    .025
    21    1    .000    .039    .111    .053    .080    .149    .158
    22    2    .553    .442    .167    .468    .425    .132    .140
    23   11    .000    .064    .057    .080    .119    .060    .063
    24    7    .000    .075    .072    .101    .151    .077    .081
    25   31    .082    .104    .030    .092    .098    .031    .033
    26   12    .000    .047    .045    .044    .066    .049    .052
    27   34    .022    .058    .028    .051    .066    .027    .029
    28   36    .050    .075    .026    .066    .075    .027    .028
    29   55    .284    .244    .034    .241    .219    .026    .027
    30   25    .000    .051    .040    .046    .069    .038    .040
    31   67    .087    .103    .020    .101    .108    .021    .023
    32   50    .132    .138    .022    .135    .137    .024    .026


Table 4-5: Small Area Estimates of the Proportions of Uninsured Asian: year 1997 (continued)

Domain  n_i  Direct      HB  se(HB)      EB      EB  se(EB)  se(EB)
                                       λ=.5     λ=1    λ=.5     λ=1
    33    4    .000    .044    .062    .040    .060    .078    .083
    34    5    .000    .058    .070    .052    .078    .082    .087
    35    4    .000    .045    .064    .030    .045    .080    .085
    36    6    .000    .047    .057    .058    .087    .067    .071
    37   33    .164    .165    .030    .163    .163    .032    .035
    38   13    .057    .094    .045    .085    .099    .050    .053
    39   27    .000    .041    .033    .038    .057    .032    .034
    40   24    .000    .044    .034    .030    .045    .033    .035
    41   64    .186    .184    .023    .183    .182    .024    .025
    42   34    .108    .127    .028    .122    .128    .031    .033
    43   83    .083    .103    .019    .099    .107    .019    .020
    44   70    .185    .172    .023    .178    .175    .021    .022
    45    6    .000    .057    .063    .066    .099    .072    .077
    46    6    .000    .059    .064    .058    .086    .074    .079
    47    6    .125    .155    .071    .179    .207    .080    .085
    48    5    .000    .056    .068    .067    .100    .080    .085
    49   10    .392    .308    .078    .312    .273    .053    .056
    50    7    .000    .044    .050    .043    .065    .059    .062
    51   17    .093    .110    .036    .104    .110    .041    .044
    52   23    .154    .160    .034    .153    .153    .038    .040
    53   23    .265    .237    .041    .247    .238    .039    .041
    54   38    .160    .162    .028    .179    .189    .030    .032
    55   37    .080    .104    .028    .112    .127    .029    .031
    56   66    .399    .326    .048    .332    .298    .023    .025
    57    1    .000    .050    .127    .044    .066    .167    .177
    58    0       -       -       -       -       -       -       -
    59    2    .587    .424    .194    .424    .343    .112    .119
    60    1    .000    .093    .180    .062    .092    .219    .232
    61   10    .185    .166    .051    .164    .154    .050    .054
    62   10    .343    .281    .067    .282    .252    .055    .058
    63   11    .091    .100    .042    .097    .101    .047    .050
    64   24    .359    .292    .054    .290    .255    .036    .038


Table 4-6: Small Area Estimates of the Proportions of Uninsured Asian: year 1997 (continued)

Domain  n_i  Direct      HB  se(HB)      EB      EB  se(EB)  se(EB)
                                       λ=.5     λ=1    λ=.5     λ=1
    65   19    .091    .117    .038    .126    .144    .042    .045
    66   22    .328    .263    .055    .263    .230    .035    .037
    67   35    .174    .166    .030    .172    .172    .030    .032
    68   44    .278    .240    .035    .239    .220    .027    .029
    69    1    .000    .043    .116    .059    .089    .157    .166
    70    1    .000    .038    .108    .023    .035    .146    .155
    71    1    .000    .079    .163    .115    .173    .206    .218
    72    1    .000    .087    .171    .123    .184    .214    .227
    73   66    .089    .119    .025    .110    .120    .023    .025
    74   51    .132    .143    .023    .133    .134    .025    .027
    75   97    .075    .108    .024    .098    .110    .019    .021
    76   53    .232    .210    .027    .208    .196    .025    .027
    77   76    .208    .206    .020    .212    .214    .022    .024
    78   79    .171    .169    .019    .173    .174    .020    .022
    79  168    .179    .179    .014    .183    .185    .015    .016
    80   91    .357    .299    .038    .305    .279    .020    .021
    81    3    .000    .080    .103    .113    .169    .121    .128
    82   11    .000    .051    .047    .064    .096    .052    .055
    83    7    .000    .074    .073    .105    .157    .078    .083
    84    9    .183    .175    .063    .184    .185    .059    .063
    85   55    .055    .097    .031    .081    .094    .025    .027
    86   32    .063    .089    .028    .082    .092    .030    .032
    87   94    .126    .140    .018    .131    .133    .019    .020
    88   44    .146    .149    .025    .148    .149    .027    .028
    89  102    .173    .180    .018    .189    .198    .019    .021
    90   78    .188    .183    .019    .185    .184    .021    .022
    91  167    .234    .213    .018    .213    .203    .015    .016
    92  121    .297    .255    .029    .257    .238    .017    .018
    93   12    .000    .075    .062    .110    .165    .059    .063
    94   14    .000    .059    .050    .081    .121    .050    .053
    95   19    .000    .072    .054    .091    .136    .047    .050
    96   11    .156    .166    .058    .183    .197    .058    .062


Table 4-7: Small Area Estimates of the Proportions of Uninsured Asian: year 1998

Domain  n_i  Direct      HB  se(HB)      EB      EB  se(EB)  se(EB)
                                       λ=.5     λ=1    λ=.5     λ=1
     1    3    .000    .024    .058    .061    .092    .096    .102
     2    9    .168    .165    .032    .161    .158    .049    .052
     3   27    .049    .059    .020    .065    .073    .028    .030
     4   21    .018    .032    .021    .045    .058    .030    .032
     5   13    .199    .184    .038    .174    .162    .042    .045
     6   20    .234    .224    .031    .213    .202    .040    .043
     7   75    .041    .055    .016    .069    .083    .018    .019
     8   50    .110    .114    .017    .119    .124    .023    .025
     9    2    .000    .024    .070    .051    .077    .115    .122
    10    2    .000    .020    .066    .037    .056    .107    .114
    11    8    .134    .145    .047    .154    .164    .068    .072
    12    8    .000    .029    .044    .061    .092    .063    .066
    13    6    .000    .026    .043    .050    .075    .062    .066
    14    7    .291    .269    .055    .250    .229    .063    .067
    15   29    .049    .063    .023    .075    .088    .029    .031
    16   16    .078    .086    .024    .095    .104    .037    .039
    17   17    .339    .310    .046    .287    .261    .043    .046
    18   19    .101    .109    .028    .115    .122    .039    .042
    19   68    .076    .085    .015    .096    .106    .020    .021
    20   49    .155    .149    .017    .144    .139    .022    .023
    21    0       -       -       -       -       -       -       -
    22    1    .000    .043    .134    .104    .157    .212    .225
    23   14    .067    .090    .039    .113    .136    .053    .056
    24   11    .000    .029    .039    .062    .093    .053    .056
    25   21    .058    .069    .027    .085    .099    .037    .039
    26   17    .055    .076    .034    .103    .128    .044    .047
    27   34    .100    .101    .020    .101    .101    .026    .027
    28   19    .226    .209    .031    .195    .179    .036    .038
    29   35    .133    .136    .020    .138    .140    .029    .030
    30   23    .229    .215    .031    .197    .180    .036    .038
    31   70    .035    .048    .016    .056    .067    .019    .020
    32   54    .181    .174    .019    .164    .156    .022    .024


Table 4-8: Small Area Estimates of the Proportions of Uninsured Asian: year 1998 (continued)

Domain  n_i  Direct      HB  se(HB)      EB      EB  se(EB)  se(EB)
                                       λ=.5     λ=1    λ=.5     λ=1
    33    3    .000    .036    .073    .081    .121    .114    .121
    34    2    .000    .032    .081    .065    .098    .128    .136
    35    9    .000    .033    .046    .069    .104    .063    .067
    36    4    .000    .020    .047    .035    .053    .078    .083
    37   10    .113    .126    .039    .131    .141    .057    .060
    38   11    .000    .023    .034    .048    .073    .049    .052
    39   33    .067    .074    .020    .081    .087    .027    .029
    40   24    .211    .195    .028    .180    .165    .031    .033
    41   34    .090    .102    .022    .111    .122    .030    .031
    42   20    .107    .117    .028    .125    .134    .040    .042
    43   96    .090    .094    .011    .097    .100    .016    .017
    44   73    .141    .136    .014    .131    .126    .019    .020
    45    8    .000    .029    .044    .060    .091    .064    .068
    46    2    .000    .026    .072    .049    .073    .119    .126
    47   16    .000    .026    .032    .049    .073    .042    .044
    48    7    .100    .114    .046    .123    .134    .068    .072
    49    5    .377    .324    .107    .298    .259    .071    .075
    50    4    .000    .016    .044    .026    .040    .071    .076
    51   19    .049    .061    .024    .072    .083    .036    .038
    52   30    .221    .205    .031    .193    .179    .030    .032
    53   15    .223    .209    .037    .193    .178    .041    .044
    54   21    .152    .145    .028    .142    .137    .036    .038
    55   53    .122    .121    .015    .119    .117    .021    .022
    56   62    .281    .258    .027    .234    .211    .023    .024
    57    1    .000    .011    .064    .019    .029    .107    .113
    58    0       -       -       -       -       -       -       -
    59    1    .000    .023    .075    .029    .044    .156    .165
    60    2    .000    .028    .075    .062    .093    .118    .125
    61   12    .130    .124    .027    .116    .109    .036    .039
    62    9    .000    .015    .028    .030    .045    .045    .048
    63   23    .085    .090    .021    .091    .095    .031    .033
    64   16    .284    .262    .048    .246    .227    .046    .049


Table 4-9: Small Area Estimates of the Proportions of Uninsured Asian: year 1998 (continued)

Domain  n_i     Direct  HB      se(HB)  EB      EB      se(EB)  se(EB)
                                        λ = .5  λ = 1   λ = .5  λ = 1
65      17      .168    .159    .034    .150    .141    .038    .040
66      17      .122    .121    .027    .121    .121    .038    .041
67      42      .083    .089    .017    .093    .097    .023    .025
68      48      .292    .268    .031    .245    .221    .026    .028
69      1       .000    .010    .061    .019    .029    .108    .114
70      0       -       -       -       -       -       -       -
71      3       .274    .265    .067    .241    .224    .102    .108
72      2       .546    .477    .134    .400    .327    .113    .120
73      55      .200    .200    .019    .196    .194    .025    .027
74      18      .000    .025    .031    .048    .071    .040    .043
75      64      .120    .126    .017    .134    .141    .022    .024
76      47      .304    .284    .027    .262    .241    .027    .029
77      87      .228    .220    .017    .209    .200    .019    .021
78      67      .126    .130    .015    .133    .136    .021    .022
79      114     .159    .158    .012    .157    .156    .016    .017
80      119     .321    .293    .026    .265    .237    .016    .017
81      8       .000    .025    .040    .050    .075    .059    .062
82      17      .071    .090    .031    .105    .122    .044    .046
83      11      .000    .027    .038    .058    .087    .052    .055
84      11      .000    .024    .035    .049    .073    .050    .053
85      42      .121    .134    .022    .146    .158    .029    .031
86      23      .000    .024    .028    .047    .071    .035    .037
87      49      .097    .115    .023    .129    .146    .027    .029
88      53      .161    .158    .018    .157    .155    .024    .025
89      87      .295    .277    .022    .257    .237    .020    .021
90      75      .093    .103    .016    .112    .122    .020    .022
91      139     .124    .128    .011    .132    .136    .015    .016
92      138     .274    .254    .020    .235    .216    .015    .016
93      11      .000    .029    .040    .058    .087    .054    .057
94      24      .046    .069    .029    .092    .116    .038    .040
95      18      .000    .031    .035    .063    .094    .043    .046
96      16      .063    .082    .034    .106    .127    .044    .047


Table 4-10: Small Area Estimates of the Proportions of Uninsured Asian: year 1999

Domain  n_i     Direct  HB      se(HB)  EB      EB      se(EB)  se(EB)
                                        λ = .5  λ = 1   λ = .5  λ = 1
1       4       .000    .025    .050    .048    .072    .075    .079
2       5       .000    .021    .040    .034    .051    .059    .063
3       28      .000    .020    .022    .034    .050    .026    .028
4       25      .104    .108    .023    .107    .108    .030    .032
5       21      .214    .203    .035    .195    .186    .039    .042
6       19      .053    .068    .028    .083    .098    .038    .041
7       70      .104    .112    .016    .119    .126    .020    .021
8       54      .253    .227    .028    .218    .201    .023    .024
9       2       .000    .043    .096    .045    .067    .136    .145
10      5       .000    .030    .052    .052    .078    .074    .079
11      8       .119    .129    .049    .124    .126    .065    .069
12      6       .000    .038    .058    .043    .064    .079    .084
13      9       .146    .149    .043    .140    .138    .060    .063
14      11      .000    .025    .037    .048    .072    .050    .053
15      28      .083    .086    .022    .086    .087    .028    .030
16      20      .199    .183    .032    .173    .160    .034    .036
17      22      .066    .083    .028    .076    .111    .037    .039
18      24      .053    .069    .026    .088    .106    .033    .035
19      84      .126    .129    .014    .133    .136    .019    .020
20      61      .207    .191    .023    .189    .180    .022    .023
21      2       .000    .041    .092    .044    .067    .136    .144
22      3       .000    .028    .062    .054    .081    .093    .099
23      11      .000    .034    .043    .056    .084    .056    .059
24      9       .000    .040    .050    .047    .070    .064    .068
25      12      .110    .114    .039    .107    .106    .047    .050
26      11      .000    .024    .034    .039    .058    .046    .049
27      22      .000    .021    .025    .029    .043    .030    .032
28      15      .000    .029    .036    .040    .060    .046    .049
29      26      .215    .193    .037    .195    .184    .033    .036
30      18      .000    .028    .033    .049    .073    .039    .042
31      62      .056    .073    .020    .083    .096    .022    .023
32      40      .156    .151    .021    .153    .151    .028    .029


Table 4-11: Small Area Estimates of the Proportions of Uninsured Asian: year 1999 (continued)

Domain  n_i     Direct  HB      se(HB)  EB      EB      se(EB)  se(EB)
                                        λ = .5  λ = 1   λ = .5  λ = 1
33      3       .000    .030    .065    .056    .083    .100    .102
34      1       .000    .023    .091    .041    .061    .146    .154
35      11      .000    .033    .042    .045    .068    .055    .058
36      2       .000    .027    .074    .075    .112    .112    .119
37      9       .315    .271    .062    .257    .228    .052    .055
38      7       .000    .027    .044    .041    .062    .060    .064
39      26      .000    .030    .031    .045    .067    .034    .036
40      18      .000    .021    .027    .036    .055    .035    .037
41      45      .159    .152    .023    .156    .155    .025    .027
42      21      .127    .129    .028    .129    .130    .037    .039
43      85      .097    .106    .015    .109    .115    .019    .020
44      61      .121    .123    .017    .131    .136    .023    .024
45      4       .000    .038    .066    .056    .084    .094    .100
46      5       .157    .157    .060    .166    .170    .078    .082
47      16      .090    .106    .037    .107    .116    .047    .050
48      5       .415    .355    .092    .337    .299    .075    .080
49      10      .324    .273    .065    .258    .224    .047    .050
50      7       .000    .019    .035    .020    .025    .051    .055
51      14      .135    .131    .032    .128    .125    .039    .042
52      27      .211    .192    .030    .175    .156    .031    .033
53      16      .063    .074    .030    .087    .099    .040    .043
54      25      .194    .184    .030    .177    .168    .035    .037
55      50      .144    .140    .019    .148    .150    .023    .025
56      61      .208    .194    .022    .190    .181    .022    .023
57      3       .371    .335    .103    .290    .250    .110    .116
58      1       1.00    .839    .259    .688    .532    .188    .200
59      1       .000    .022    .089    .069    .104    .143    .152
60      5       .347    .300    .095    .284    .252    .074    .078
61      10      .104    .104    .036    .109    .111    .045    .048
62      12      .146    .148    .035    .139    .136    .049    .052
63      19      .102    .103    .026    .108    .110    .035    .037
64      23      .181    .171    .029    .158    .146    .034    .036


Table 4-12: Small Area Estimates of the Proportions of Uninsured Asian: year 1999 (continued)

Domain  n_i     Direct  HB      se(HB)  EB      EB      se(EB)  se(EB)
                                        λ = .5  λ = 1   λ = .5  λ = 1
65      21      .140    .139    .030    .140    .140    .037    .040
66      27      .419    .364    .053    .326    .280    .034    .036
67      40      .151    .148    .020    .150    .150    .025    .027
68      59      .130    .131    .019    .133    .135    .024    .025
69      0       -       -       -       -       -       -       -
70      1       .000    .040    .125    .064    .095    .188    .200
71      1       .000    .023    .092    .071    .107    .145    .154
72      2       .000    .039    .088    .050    .074    .132    .140
73      20      .143    .138    .027    .122    .112    .033    .035
74      18      .000    .023    .028    .031    .046    .036    .039
75      46      .096    .104    .019    .098    .098    .024    .026
76      38      .079    .090    .022    .083    .085    .027    .029
77      49      .235    .216    .026    .204    .189    .024    .026
78      61      .116    .122    .017    .125    .130    .022    .024
79      98      .151    .150    .014    .146    .143    .018    .019
80      86      .264    .241    .025    .223    .202    .020    .021
81      7       .000    .034    .050    .055    .082    .068    .072
82      10      .065    .078    .036    .101    .120    .051    .054
83      18      .052    .075    .034    .084    .101    .043    .046
84      7       .000    .036    .052    .047    .070    .070    .075
85      23      .164    .164    .028    .142    .131    .036    .039
86      27      .106    .113    .025    .112    .115    .032    .034
87      55      .071    .083    .018    .079    .083    .022    .024
88      41      .052    .068    .021    .067    .075    .025    .027
89      60      .162    .160    .018    .158    .156    .022    .024
90      59      .156    .152    .018    .151    .149    .022    .023
91      129     .144    .143    .012    .144    .144    .015    .016
92      93      .283    .255    .027    .238    .216    .018    .020
93      6       .000    .030    .048    .061    .092    .068    .072
94      12      .000    .032    .040    .064    .096    .051    .054
95      18      .053    .079    .036    .085    .101    .045    .048
96      12      .041    .068    .042    .077    .095    .056    .060


Table 4-13: Small Area Estimates of the Proportions of Uninsured Asian: year 2000

Domain  n_i     Direct  HB      se(HB)  EB      EB      se(EB)  se(EB)
                                        λ = .5  λ = 1   λ = .5  λ = 1
1       10      .126    .133    .043    .148    .158    .057    .060
2       0       -       -       -       -       -       -       -
3       24      .063    .074    .025    .076    .082    .037    .039
4       28      .146    .150    .027    .163    .171    .041    .043
5       20      .138    .143    .032    .153    .160    .043    .046
6       17      .112    .120    .032    .134    .144    .019    .021
7       78      .097    .104    .015    .107    .112    .022    .024
8       66      .274    .253    .023    .240    .224    .072    .076
9       5       .173    .164    .061    .160    .154    .078    .082
10      6       .000    .033    .051    .082    .123    .070    .074
11      7       .000    .032    .047    .090    .134    .054    .057
12      11      .335    .302    .056    .275    .245    .060    .064
13      7       .134    .134    .045    .130    .128    .103    .110
14      2       .000    .020    .064    .026    .039    .031    .033
15      27      .000    .023    .023    .035    .052    .032    .034
16      29      .113    .119    .024    .123    .127    .033    .035
17      27      .120    .127    .025    .141    .152    .044    .047
18      14      .000    .024    .030    .041    .062    .019    .021
19      77      .131    .133    .015    .133    .134    .021    .023
20      75      .223    .213    .018    .207    .200    .089    .095
21      3       .000    .022    .056    .028    .043    .070    .074
22      6       .000    .026    .045    .052    .079    .071    .075
23      8       .000    .037    .050    .108    .162    .063    .067
24      9       .000    .029    .042    .062    .093    .052    .055
25      10      .000    .023    .034    .031    .046    .061    .065
26      6       .000    .020    .039    .029    .044    .031    .033
27      32      .098    .105    .023    .108    .114    .035    .037
28      23      .000    .024    .025    .037    .055    .032    .034
29      25      .187    .173    .030    .151    .134    .035    .037
30      23      .227    .210    .032    .188    .169    .021    .022
31      71      .118    .123    .016    .125    .128    .024    .026
32      50      .109    .113    .019    .112    .113    .113    .120


Table 4-14: Small Area Estimates of the Proportions of Uninsured Asian: year 2000 (continued)

Domain  n_i     Direct  HB      se(HB)  EB      EB      se(EB)  se(EB)
                                        λ = .5  λ = 1   λ = .5  λ = 1
33      2       .000    .024    .071    .037    .055    .115    .122
34      2       .000    .026    .073    .047    .070    .058    .061
35      8       .108    .113    .042    .112    .114    .067    .071
36      7       .000    .030    .045    .065    .098    .051    .054
37      9       .062    .069    .035    .062    .063    .036    .038
38      17      .000    .019    .024    .023    .034    .037    .040
39      24      .117    .124    .028    .134    .142    .040    .043
40      20      .000    .028    .029    .052    .078    .025    .027
41      50      .163    .160    .020    .156    .153    .027    .029
42      38      .141    .139    .021    .133    .130    .020    .022
43      76      .104    .112    .016    .120    .128    .020    .022
44      73      .142    .142    .016    .139    .137    .119    .127
45      2       .000    .027    .076    .051    .076    .090    .095
46      3       .000    .021    .056    .023    .035    .052    .055
47      10      .000    .024    .034    .044    .066    .068    .072
48      7       .000    .029    .045    .068    .102    .051    .054
49      10      .087    .095    .037    .099    .105    .078    .083
50      5       .000    .027    .050    .053    .080    .032    .034
51      23      .038    .053    .023    .056    .066    .037    .039
52      21      .243    .223    .037    .198    .176    .030    .032
53      31      .114    .120    .022    .121    .124    .040    .042
54      18      .202    .195    .031    .188    .182    .019    .020
55      74      .094    .102    .015    .102    .106    .019    .020
56      83      .204    .192    .017    .178    .165    .133    .141
57      2       .000    .029    .082    .062    .092    .146    .154
58      1       .000    .019    .087    .023    .035    .000    .000
59      2       .000    .020    .063    .021    .032    .103    .194
60      8       .112    .120    .044    .132    .143    .059    .063
61      16      .202    .187    .036    .169    .152    .040    .043
62      3       .301    .276    .086    .252    .227    .100    .107
63      33      .055    .069    .020    .073    .082    .028    .030
64      28      .105    .112    .024    .115    .120    .032    .034


Table 4-15: Small Area Estimates of the Proportions of Uninsured Asian: year 2000 (continued)

Domain  n_i     Direct  HB      se(HB)  EB      EB      se(EB)  se(EB)
                                        λ = .5  λ = 1   λ = .5  λ = 1
65      33      .126    .129    .021    .126    .126    .029    .031
66      13      .393    .350    .054    .323    .288    .048    .051
67      70      .080    .089    .015    .088    .093    .019    .021
68      75      .179    .171    .017    .159    .149    .019    .021
69      1       .000    .851    .248    .705    .558    .163    .173
70      2       .361    .331    .098    .299    .268    .119    .126
71      4       .000    .023    .050    .032    .048    .077    .082
72      2       .000    .045    .101    .157    .236    .155    .165
73      45      .271    .256    .026    .256    .249    .028    .030
74      10      .000    .024    .034    .034    .051    .051    .055
75      83      .149    .150    .016    .160    .166    .020    .021
76      59      .113    .120    .018    .128    .136    .023    .024
77      68      .338    .313    .025    .302    .284    .023    .024
78      39      .098    .103    .020    .102    .104    .026    .028
79      122     .110    .117    .013    .125    .133    .016    .017
80      125     .308    .281    .020    .262    .239    .016    .017
81      7       .000    .029    .043    .066    .099    .065    .069
82      12      .000    .025    .032    .047    .070    .048    .051
83      13      .049    .068    .035    .088    .108    .050    .053
84      4       .000    .028    .056    .060    .091    .088    .093
85      32      .189    .193    .027    .217    .231    .035    .037
86      10      .136    .137    .036    .127    .123    .051    .054
87      52      .192    .185    .021    .184    .180    .024    .026
88      65      .153    .155    .018    .162    .166    .022    .024
89      71      .285    .265    .022    .256    .242    .022    .023
90      57      .086    .095    .017    .102    .110    .022    .024
91      153     .149    .150    .011    .156    .160    .014    .015
92      138     .308    .283    .020    .266    .244    .015    .017
93      10      .000    .030    .041    .073    .110    .059    .063
94      16      .067    .081    .029    .090    .101    .042    .044
95      18      .108    .123    .032    .145    .163    .046    .049
96      14      .111    .125    .039    .160    .185    .050    .053


CHAPTER  5 

CONCLUDING REMARKS AND SCOPE FOR FUTURE WORK

The primary focus of this dissertation has been on initiating and developing
empirical and hierarchical Bayesian methodology for estimating finite population
strata means, based on a simple regression model in which the observed covariates
are subject to measurement error. We have considered both the functional and
structural measurement error models. The former refers to the situation where the
unmeasured covariates are non-stochastic, while the latter refers to the situation
where they are stochastic. The major application of this approach is in small area
estimation, where there is a large number of areas but the number of observations
per area is relatively small.

The second part of this dissertation concerns small area estimation based on binary
data. Once again, both empirical and hierarchical Bayesian estimators are developed.
The results are applied to estimating the proportion of uninsured persons in
different cross-sections of minority populations.

A very natural extension of the first part of our work is to consider a general
multiple regression model that can accommodate multiple covariates. A potential
technical development is a second-order approximation of the Bayes risks.

An immediate extension of the second part of this dissertation is to the
one-parameter natural exponential family of distributions. In this way, we can
accommodate the analysis of both discrete and continuous data. One important
example is count data, which occur quite often in rare disease analysis.




APPENDIX 


APPENDIX: Chapter 2

The  following  lemma  will  be  very  useful  in  proving  all  the  results. 

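Lemma 2.2.1. Let $T_1, T_2, \ldots, T_s$ be independently distributed with
$T_1 \sim N(\mu, 1)$ and $T_2, \ldots, T_s \sim N(0, 1)$. Then, writing
$\lambda = \mu^2/2$,

(i) $E\left(T_1^2 \big/ \sum_{i=1}^{s} T_i^2\right) = E[(2K + 1)/(2K + s)]$;

(ii) $E\left(T_1 \big/ \sum_{i=1}^{s} T_i^2\right) = \mu E[(2K + s)^{-1}]$;

(iii) $E\left[T_1 \big/ \left(\sum_{i=1}^{s} T_i^2\right)^2\right] = \mu E[(2K + s)^{-1}(2K + s - 2)^{-1}]$;

(iv) $E\left[T_1^2 \big/ \left(\sum_{i=1}^{s} T_i^2\right)^2\right] = E[(2K + 1)(2K + s)^{-1}(2K + s - 2)^{-1}]$,

where $K \sim$ Poisson$(\lambda)$.

Proof.

(i) Conditional on $K = k$, $T_1^2, T_2^2, \ldots, T_s^2$ are mutually independent with
$T_1^2 \sim \chi^2_{2k+1}$ and $T_2^2, \ldots, T_s^2$ iid $\chi^2_1$, where $K \sim$ Poisson$(\lambda)$.
Hence, conditional on $K = k$,
$T_1^2 \big/ \sum_{i=1}^{s} T_i^2 \sim \mathrm{Beta}\left(\tfrac{1}{2}(2k + 1), \tfrac{1}{2}(s - 1)\right)$, so that
$E\left(T_1^2 \big/ \sum_{i=1}^{s} T_i^2 \mid K = k\right) = (2k + 1)/(2k + s)$. The result follows.

(ii) Conditional on $K = k$, $\sum_{i=1}^{s} T_i^2 \sim \chi^2_{2k+s}$, where $K \sim$ Poisson$(\lambda)$.
This leads to the identity

\[
\sum_{k=0}^{\infty} \exp(-\lambda) \frac{\lambda^k}{k!} (2k - 2 + s)^{-1}
  = E\left(\sum_{i=1}^{s} T_i^2\right)^{-1}. \tag{A.2.1}
\]

Since $\lambda = \mu^2/2$, differentiating both sides of (A.2.1) with respect to $\mu$, one gets

\[
\mu \left[ \sum_{k=0}^{\infty} \exp(-\lambda) \frac{\lambda^k}{k!} (2k + s)^{-1}
  - E\left(\sum_{i=1}^{s} T_i^2\right)^{-1} \right]
  = E\left[ (T_1 - \mu) \left(\sum_{i=1}^{s} T_i^2\right)^{-1} \right]. \tag{A.2.2}
\]

The result follows now from (A.2.2).

(iii) Similar to (A.2.1), we begin with the identity

\[
\sum_{k=0}^{\infty} \exp(-\lambda) \frac{\lambda^k}{k!} E\left(\chi^2_{2k+s}\right)^{-2}
  = E\left(\sum_{i=1}^{s} T_i^2\right)^{-2}
  = \int_{-\infty}^{\infty} \!\cdots\! \int_{-\infty}^{\infty}
    \left(\sum_{i=1}^{s} t_i^2\right)^{-2} (2\pi)^{-s/2}
    \exp\left[ -\tfrac{1}{2}(t_1 - \mu)^2 - \tfrac{1}{2}\sum_{i=2}^{s} t_i^2 \right]
    dt_1 \cdots dt_s. \tag{A.2.3}
\]

Differentiating both sides of (A.2.3) with respect to $\mu$,

\[
\mu \left[ \sum_{k=0}^{\infty} \exp(-\lambda) \frac{\lambda^k}{k!} E\left(\chi^2_{2k+2+s}\right)^{-2}
  - E\left(\sum_{i=1}^{s} T_i^2\right)^{-2} \right]
  = E\left[ (T_1 - \mu) \left(\sum_{i=1}^{s} T_i^2\right)^{-2} \right]. \tag{A.2.4}
\]

The result follows now from (A.2.4), since
$E(\chi^2_{\nu})^{-2} = (\nu - 2)^{-1}(\nu - 4)^{-1}$ for $\nu > 4$.

(iv) Conditional on $K = k$, $T_1^2 \big/ \sum_{i=1}^{s} T_i^2$ and $\sum_{i=1}^{s} T_i^2$ are mutually
independent with
$T_1^2 \big/ \sum_{i=1}^{s} T_i^2 \sim \mathrm{Beta}\left(\tfrac{2k+1}{2}, \tfrac{s-1}{2}\right)$ and
$\sum_{i=1}^{s} T_i^2 \sim \chi^2_{2k+s}$. Hence,
$E\left[T_1^2 \big/ \left(\sum_{i=1}^{s} T_i^2\right)^2 \mid K = k\right] = [(2k + 1)/(2k + s)](2k + s - 2)^{-1}$.
This proves the result.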
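
As a rough numerical sanity check of parts (i) and (ii) of Lemma 2.2.1, the sketch below compares a truncated Poisson-mixture sum with a Monte Carlo average. It assumes the reading $\lambda = \mu^2/2$; the function names and the values of `mu`, `s`, `draws`, and `seed` are illustrative choices, not taken from the dissertation.

```python
import math
import random


def poisson_mixture(g, lam, terms=200):
    """Approximate E[g(K)] for K ~ Poisson(lam) by summing the first `terms` terms."""
    total, pmf = 0.0, math.exp(-lam)
    for k in range(terms):
        total += pmf * g(k)
        pmf *= lam / (k + 1)  # Poisson pmf recursion: p(k+1) = p(k) * lam / (k+1)
    return total


def monte_carlo(mu, s, draws=200_000, seed=1):
    """Simulate T_1 ~ N(mu, 1), T_2..T_s ~ N(0, 1) and average the two ratios."""
    rng = random.Random(seed)
    sum_sq_ratio = 0.0   # accumulates T_1^2 / sum T_i^2
    sum_lin_ratio = 0.0  # accumulates T_1 / sum T_i^2
    for _ in range(draws):
        t1 = rng.gauss(mu, 1.0)
        total = t1 * t1 + sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(s - 1))
        sum_sq_ratio += t1 * t1 / total
        sum_lin_ratio += t1 / total
    return sum_sq_ratio / draws, sum_lin_ratio / draws


mu, s = 1.5, 6            # illustrative values only
lam = mu * mu / 2.0       # assumed noncentrality convention
exact_i = poisson_mixture(lambda k: (2 * k + 1) / (2 * k + s), lam)
exact_ii = mu * poisson_mixture(lambda k: 1.0 / (2 * k + s), lam)
sim_i, sim_ii = monte_carlo(mu, s)
print("part (i): ", exact_i, sim_i)
print("part (ii):", exact_ii, sim_ii)
```

The two columns should agree to about two decimal places; the agreement is only as good as the Monte Carlo error allows and is not a substitute for the proof.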




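Proof of Theorem 2.2.1. By the independence of $(\bar y_1, \ldots, \bar y_m)$ and
$(\bar X_1, \ldots, \bar X_m)$,

\[
\begin{aligned}
E(\hat b_1)
 &= E\,E\left[ \frac{\sum n_i \bar y_i (\bar X_i - \bar X)}{\sum n_i (\bar X_i - \bar X)^2}
      \,\Big|\, \bar X_1, \ldots, \bar X_m \right] \\
 &= E\left[ \frac{\sum n_i (b_0 + b_1 x_i)(\bar X_i - \bar X)}{\sum n_i (\bar X_i - \bar X)^2} \right] \\
 &= b_1 E\left[ \frac{\sum n_i x_i (\bar X_i - \bar X)}{\sum n_i (\bar X_i - \bar X)^2} \right] \\
 &= b_1 E\left[ \frac{\sum n_i \bar X_i (x_i - \bar x)}{\sum n_i (\bar X_i - \bar X)^2} \right].
\end{aligned}
\tag{A.2.5}
\]

We now introduce the orthogonal transformation
$(Z_1, \ldots, Z_m)^T = C(\sqrt{n_1}\,\bar X_1, \ldots, \sqrt{n_m}\,\bar X_m)^T$,
where $C$ is an orthogonal matrix with first two rows given by
$(\sqrt{n_1}/\sqrt{n_T}, \ldots, \sqrt{n_m}/\sqrt{n_T})$ and
$\left(\sqrt{n_1}(x_1 - \bar x), \ldots, \sqrt{n_m}(x_m - \bar x)\right) \big/ \{\sum n_i (x_i - \bar x)^2\}^{1/2}$.
Then,

\[
\sum n_i \bar X_i (x_i - \bar x) = \Big\{\sum n_i (x_i - \bar x)^2\Big\}^{1/2} Z_2
\quad\text{and}\quad
\sum n_i (\bar X_i - \bar X)^2 = \sum_{i=2}^{m} Z_i^2.
\]

Hence, from (A.2.5),

\[
E(\hat b_1)
 = b_1 \Big[\sum n_i (x_i - \bar x)^2\Big]^{1/2} E\Big( Z_2 \Big/ \sum_{i=2}^{m} Z_i^2 \Big)
 = b_1 \Big[\sum n_i (x_i - \bar x)^2\Big]^{1/2} \sigma_\eta^{-1}
   E\Big( T_2 \Big/ \sum_{i=2}^{m} T_i^2 \Big),
\tag{A.2.6}
\]

where $T_2 = Z_2/\sigma_\eta \sim N\left( \{\sum n_i (x_i - \bar x)^2\}^{1/2}/\sigma_\eta,\, 1 \right)$ and
$T_i = Z_i/\sigma_\eta \sim N(0, 1)$ for $3 \le i \le m$.

To see this, first we observe that

\[
E(Z_2) = \sum \sqrt{n_i}(x_i - \bar x)\sqrt{n_i}\,x_i \Big/ \Big\{\sum n_i (x_i - \bar x)^2\Big\}^{1/2}
       = \Big\{\sum n_i (x_i - \bar x)^2\Big\}^{1/2}.
\]

Next, for every $3 \le i \le m$, writing the $i$th row of $C$ as $(c_{i1}, \ldots, c_{im})$, we get
the two identities

\[
\sum_{k=1}^{m} c_{ik}\sqrt{n_k}/\sqrt{n_T} = 0
\quad\text{and}\quad
\sum_{k=1}^{m} c_{ik}\sqrt{n_k}(x_k - \bar x) \Big/ \Big\{\sum_{k=1}^{m} n_k (x_k - \bar x)^2\Big\}^{1/2} = 0,
\]

or equivalently

\[
\sum_{k=1}^{m} c_{ik}\sqrt{n_k} = 0
\quad\text{and}\quad
\sum_{k=1}^{m} c_{ik}\sqrt{n_k}(x_k - \bar x) = 0.
\]

Together they imply $\sum_{k=1}^{m} c_{ik}\sqrt{n_k}\,x_k = 0$, which is equivalent to
$E(Z_i) = 0$, $3 \le i \le m$.

Next, applying part (ii) of Lemma 2.2.1, one gets from (A.2.6),

\[
\begin{aligned}
E(\hat b_1)
 &= b_1 \Big\{\sum n_i (x_i - \bar x)^2\Big\}^{1/2} \Big\{\sum n_i (x_i - \bar x)^2\Big\}^{1/2}
    \sigma_\eta^{-2} E(2K_m + m - 1)^{-1} \\
 &= b_1 \Big[\sum n_i (x_i - \bar x)^2/(m - 1)\Big] \sigma_\eta^{-2}
    E[(m - 1)(2K_m + m - 1)^{-1}],
\end{aligned}
\tag{A.2.7}
\]

where $K_m \sim$ Poisson$\left(\sum n_i (x_i - \bar x)^2/(2\sigma_\eta^2)\right)$. Since
$E(K_m/(m - 1)) \to c/(2\sigma_\eta^2)$ and
$V(K_m/(m - 1)) = \sum n_i (x_i - \bar x)^2 \big/ \{2\sigma_\eta^2 (m - 1)^2\} \to 0$ by the
assumption of the theorem, $K_m/(m - 1) \overset{P}{\to} c/(2\sigma_\eta^2)$. Now, by the
dominated convergence theorem,

\[
E[(m - 1)(2K_m + m - 1)^{-1}]
 = E\left[ \left( 1 + \frac{2K_m}{m - 1} \right)^{-1} \right]
 \to \left( 1 + \frac{c}{\sigma_\eta^2} \right)^{-1}
 = \frac{\sigma_\eta^2}{\sigma_\eta^2 + c}.
\]

Accordingly, from (A.2.7),

\[
E(\hat b_1) \to b_1 c (\sigma_\eta^2 + c)^{-1} = b_1 c/(\sigma_\eta^2 + c).
\]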
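
The limit above says that the slope computed from error-prone area means is attenuated by the factor $c/(\sigma_\eta^2 + c)$. The simulation below is a minimal sketch of that effect under assumed conditions: equal area sample sizes, Gaussian errors, and a made-up grid of true covariates. All parameter values and the function name `naive_slope` are hypothetical, chosen only for illustration.

```python
import random


def naive_slope(m=5000, n=4, b0=1.0, b1=2.0,
                sigma_u=1.0, sigma_e=1.0, sigma_eta=2.0, seed=7):
    """Return (simulated slope, limiting value b1 * c / (sigma_eta^2 + c))."""
    rng = random.Random(seed)
    # Fixed (non-stochastic) true covariates: an assumed repeating grid.
    x = [0.5 * (i % 7) for i in range(m)]
    # Area means of the error-prone covariate: variance sigma_eta^2 / n.
    X = [xi + rng.gauss(0.0, sigma_eta / n ** 0.5) for xi in x]
    # Area means of the response: area effect plus averaged unit-level error.
    y = [b0 + b1 * xi + rng.gauss(0.0, sigma_u) + rng.gauss(0.0, sigma_e / n ** 0.5)
         for xi in x]
    # With equal n_i the n-weighted means reduce to simple means.
    x_bar = sum(x) / m
    X_bar = sum(X) / m
    num = sum(n * yi * (Xi - X_bar) for yi, Xi in zip(y, X))
    den = sum(n * (Xi - X_bar) ** 2 for Xi in X)
    c = sum(n * (xi - x_bar) ** 2 for xi in x) / (m - 1)
    return num / den, b1 * c / (sigma_eta ** 2 + c)


slope, limit = naive_slope()
print("simulated slope:", slope)
print("limiting value: ", limit)
```

With these settings the true slope is 2, while both printed values should sit near half of that, which is the attenuation the theorem describes.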


Next,  by  the  iterated  formula  for  the  variance. 


y(6i)  = y[E(6i|Xi, . . . ,x„)]  + E[v(h\x,, . . . ,x„)] 


'Y,'riiX^{xi  - x) 

-X-  P 

'j2nUa^^  + a^Jn,m-Xf 

[Y.n,{X,-Xf\ 

T -LX 

{j:n,ix.-xyr 

= ^'^'rii{xi  - x)^|  V 


+ <E 


- znm-x)^ ' 

_j2n,{x,-xy}\ 


+ alE  \^ni[Xi~Xf  ' 
Now,  by  parts  (iv)  and  (ii)  of  Lemma  1, 


(A.2.8) 


E 


2 


= afE  [{2K^  + l){2Km  + m-  l)-\2Km  + m-  3)-'] 


{ - x)2|  [E{2Km  + m-  (A.2.9) 




Hence,  from  (A. 2. 9), 

rii{xi  - xfV  I Z2  / ^ 


= cr^  ^ “ xf/{m  - 1) 


E 


2Km  . ^ ^ fi  + 


m — I m — I J \ m — I 


1 - 


2 2K„ 


m — 1 m — 1 


-1 


- a„ 


_4  f ~ 

m — 1 


E 


m — 1 


2Km  + m — 1 


-2 


(^r,  C (c/(t2)  (1+^2 


Further,  by  Assumption  (i),  J2^ ~ ^ J2T 

E [T,Trii{Xi  - X)2]“^  = E[2K^  + m - 1]-^  ^ 0.  Hence,  from  (A.2.8)  and 
(A.2.10), $V(\hat b_1) \to 0$ as $m \to \infty$. The proof of Theorem 2.2.1 is complete.

Proof  of  Theorem  2.2.2 
Direct  calculations  yield 

m 

E{MSBy)  = ^ni[l/(yi-y)  + {F;(yi-y)}']/(m-l) 


Y ni  \{(xl/ni  + al)  + Y + ^l) 

1 


-2  rii{al/ni  + al)  + {bo  + biXi  - bo  - b^x)  /{m  - 1) 


al{m  -l)  + al  Qm  + b^Y 


9m/ {m  -l)+blY  C'^  “ I)' 




This  proves  the  first  part  of  the  theorem.  To  prove  the  second  part,  writing  r.;  = 
y/fTiyi  (i  = 1,. . ,,m),  E{ri)  = y/n~i{bo  + biXi)  = ^i[bo  + biX  + bi{xi-x)]  and  V{ri)  = 
al  + Uiol.  Thus,  writing  r = (ri, . . . , r^Y , = [^/n[{xl  -x),...,  ^Jn^{x.,n  ~ S)]^, 

= Diag(ni,...,n„), 

E{r)  = s/n^{bQ  + bix)UmE  l/(r)  = cr^Jm  + (A.2.11) 


where  we  recall  that  = (y^/y^, . . . , Yn^/y/n^)-  Now,  by  part  (ii)  of 
Theorem  1 of  Searle  (1971), 


V{MSBy)  = 2{m 


1)- 


U{  Yjm  + CTuDrYYn 


Urr,U 


mJ 


+ 2'riT 


^(6o  + bix)vY 


^(6o  + bix)u^  + q 


(A.2.12) 




Noting  that  u^Um  = 1 and  q'^Um  = 0,  it  follows  that  [{bo  + bix)u^  + q^{Im  — 
u,^ul)  = ql.  Also,  qlD^q^  = YZi  ^ E™  i n,{x.  - xf.  Further, 


{^e^m  "F  0'y^Drn){I m 


O'ti^rn  '^m'f^m)  “I"  '^u  1 Dm{Im  UmU^ 


= Crti'ITT'  - 1)  + 


-Drn  D 


0^2  ^2 


^r(-Dm)  " ‘^U.^D rn^rn 


= aj{m-l)  + al 


2 Y1 


- 2.y,  (nr  - . (A^2.13) 

Since  rii  < K for  all  1 < z < m,  E”i  + (E^i/''^7’)^  — 2E™iV''^r  < + AT^  + 

2iF^  = K'^{m  + 3).  Hence,  from  (A. 2. 12)  and  (A. 2. 13), 

V{MSBy)  < 2(m-l)“2[a^(m-l)  + iFV2(m  + 3)  + 

m 

2alal  + K{m-l)  + KY^rii{xi  - x)^] 

i=l 

= 0{m~^)  -^0  as  m — > oo. 




Proof  of  Theorem  2.3.1  We  rewrite 

^PB  _ ^EB  ^ ^ - (1  - fA)m  - + hx,) 

= /,p,  - B,)y,  + B,{bo  + hX,)  - My  + h{X,  - W)}] 

= fi[{Bi  - Bi){{yi  -bo-  biXi)  - bi{Xi  - Xi)} 

- MiV  -bo-  bix)  + {k  - bi){X,  -X)-  b,{X  - x)}].  (A.2.14) 

Hence,  by  the  elementary  inequality  (a  — b)^  < 2(a^  + 6^)  and  ff  < 1, 

m 


< 2m 


m 

^{Bi  - Bif{{yi  - ho-  hiXi)  - hi{Xi  - Xi)}^ 
i=l 


+ BUiy  -bo-  bix)  + {k  - bi)M  -X)-b,{X-  xM  ■ 

i=l 


(A.2.15) 




First,  we  observe  that 


max  \Bi  — Bi\  = max 

l<i<m  \<i<m 


(j: 


al  + n,al  aj  + rual 


max 

l<i<m 


1 


1 + UiM  1 + 


{M  = all  al,M  = all  al) 


max  < M~^\M  - M[  ^ 0 as  m ^ oo. 


:<i<m  (1  _|_  mM){l  + UiM) 


(A.2.16) 
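
The bound in (A.2.16) can be checked numerically. The sketch below assumes the shrinkage factor has the form $B_i = 1/(1 + n_i M)$ with $M$ the variance ratio, and that the estimate $\hat M$ is non-negative; under that reading the gap $|\hat B_i - B_i|$ never exceeds $M^{-1}|\hat M - M|$, whatever the sample size. The helper names and the grid of values are illustrative only.

```python
def shrinkage(n_i, M):
    """Shrinkage factor written as 1 / (1 + n_i * M), with M a variance ratio."""
    return 1.0 / (1.0 + n_i * M)


def max_gap(sizes, M_hat, M):
    """Largest |B_i(M_hat) - B_i(M)| over the given area sample sizes."""
    return max(abs(shrinkage(n, M_hat) - shrinkage(n, M)) for n in sizes)


sizes = range(1, 201)   # hypothetical area sample sizes
M = 0.8                 # hypothetical true variance ratio
for M_hat in (0.4, 0.7, 0.79, 0.9, 2.0):
    gap = max_gap(sizes, M_hat, M)
    bound = abs(M_hat - M) / M
    print(M_hat, round(gap, 4), round(bound, 4))
```

Each printed gap should be no larger than the bound next to it, and both shrink together as `M_hat` approaches `M`.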


Further 


m 


-1 


'^E<{yi-bo-  biXi)  - bi{X,  - Xi 


i=l 


m 

i=l 


V{y,)  + blV{X,) 


= m ^ '^{allui  + al  + 

i=l 


<al  + al  + blal  = 0(1), 


(A.2.17) 




by  assumption  (i)  of  the  theorem.  Hence,  from  (A. 2. 16)  and  (A. 2. 17). 

m 
i=l 


< max  \13i 

l<i<m 


m ( 

Bi\^mr^  I {Vi -bo-  biXi)  - bi{Xi 


0 as  m 


CO. 


(A.2.18) 


Also, 


m ^^{Bi-  B,f{{yi  - bo  - biXt)  - bi{Xi  - x^)}^ 

i=l 


i=l 


bo  — biXi)  — bi{Xi  — Xi)  > and 


E 


m 1 

i=l 


biXi) 


b\{Xi  Xj) 


2n  2 


m f 

T’ 

< E 

i=i  [ 

Hi - bo  - biXi)  - bi{Xi  - Xi)  > 

< 8E 


m 

'^{'Vi  - bo 

i=l 


i=l 


= 8 


{al/rii  + alf  + m 

t=i 


< 24 


+ by. 


(A.2.19) 




which  shows  that  the  left  hand  side  of  (A. 2. 17)  is  uniformly  integrable  in  rn  > 1. 


Hence,  by  (A. 2. 18)  and  (A. 2. 19), 


rn 


i=l 


2n 


{Bi  - Bifl  {yi  - bo  - biXi)  - bi{Xi  - Xi) 


0 as  rn 


oo. 

(A.2.20) 


Next,  we  consider 

m ( 

^ y - 6o  - b,x  + (6i  - b,){X,  -X)-  b,(X 


2i 


m 


- X 


i=l 


< E 


3m 


-E 

i=l 


(y-bo-  b,xf  + (6i  - b,f(Xi  - Xf  + 6i(X  - x) 


= 3 m' ^ 


mV(y)  + (6i  - b^f  - X)^  + b^,mV(X) 

i=l 


= 3 


^n^(a^/ni  + (T2)/'4  + £; 


(bi 


+ b 


2 

1 


< 3 


{ol  + Kal)n^^  + E{  {h  - bifm  ^ V(Xi  -Xf\+  b\alrij} 


i=l 


= 3E 


m 

^ ni{Xi  - XY 
i=l 


+ 0(1)) 


(A.2.21) 


by  assumption  (i),  (ii)  and  (iii)  of  Theorem  2.3.1. 




It  remains  to  show  that 


{bi  - bi)‘^m  ^ ^ n^{Xi  - Xy 


i=l 


= 0(1),  or  equivalently 


{bi-bifMSBx  =0(1) 


(A.2.22) 


First,  we  observe  that 


(6i  - bifMSBx]  = EE  (bi  - bifMSBx\X 


= E 


V{h  - h\X)  + {E{k\X)  - biy^^MSBx 


(A.2.23) 


But,  by  assumption  (ii)  of  the  theorem. 


vik-byx)  = v{h\x) 


= y 


Y^riiyyXi  - X) 


(m  - l)MSBx 


X 


MSBx 


MSBx  - MSWx 


V 


^riiyi(Xi  - X) 


(m  - l)MSBx 


X 


< 


Enf(a^Jn,  + cr^J(X,-Xy 
(m  - 1)2M55^ 


(af  + Kal)  / (m-  1)MSB, 


Thus, 


V(bi-bijX)MSBx  < 


{aj  + Kal)  / (m  - 1) 


MSWx\ 
“ MSBx  J 


-2 


Noting  that  MSWx/MSBx  + c),  by  a uniform  integrability 


argument. 




E(1  - MSWx/MSBx)  + Hence, 


V ibi-bi\X]  MSBx 


= 0{m  as  rn  — >■  oo. 


Also, 


E{h\X)-b,  = b, 


J^niXi{xi  - x) 


MSBx 


m{Xi  - X)2  MSBx  - MSWx 

Hence,  from  (A. 2. 24)  and  part  (i)  of  Lemma  2.2.1, 


- 1 


E 


{E{h\X) 


MSBx 


= b\E 


= h\E 


= b\E 


YniXj{xi  - x) 

(m  - l){MSBx  - MSWx) 

^Uijxi  - xf{W2l{m  - 1)) 

MSBx  - MSWx 


- 1 } MSBx 


1 } MSBx 


Yn.{x^-x)Y^  Wj/jm-l)  f 


m — 


MSB 


X 


m — 1 \ 


1 j yY  W“^ ^(m  — 1)  \^SBx  — MSWx 
MSBx 


MSBx  - MSx 


+ MSBx 


bi 


clal  + 1 


+ cr^  + c 


= [c7^  + c - 2 (cr^  + c)  + (T^  + c]  — 0. 


(A.2.24) 


This  proves  (A. 2. 22). 




APPENDIX: Chapter 3


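Proof of Lemma 3.2.1. Let $H_i = (x_i - \mu_x)/\sigma_x$, $i = 1, \ldots, m$. Then the $H_i$ are iid
$N(0, 1)$ and $\sum_{i=1}^{m} n_i (x_i - \bar x)^2 = \sigma_x^2 \sum_{i=1}^{m} n_i (H_i - \bar H)^2$, with
$\bar H = n_T^{-1} \sum_{i=1}^{m} n_i H_i$. We write
$\sum_{i=1}^{m} n_i (H_i - \bar H)^2 = H^T (D_m - d_m d_m^T/n_T) H$, where
$H = (H_1, \ldots, H_m)^T$, $D_m = \mathrm{Diag}(n_1, \ldots, n_m)$, and $d_m = (n_1, \ldots, n_m)^T$. Then

\[
\begin{aligned}
E\left[ \sum_{i=1}^{m} n_i (x_i - \bar x)^2/(m - 1) \right]
 &= \sigma_x^2\, \mathrm{tr}(D_m - d_m d_m^T/n_T)/(m - 1) \\
 &= \sigma_x^2 \Big( n_T - n_T^{-1} \sum_{i=1}^{m} n_i^2 \Big) \Big/ (m - 1) \\
 &= \sigma_x^2\, g_m/(m - 1) \to c\,\sigma_x^2 \quad\text{as } m \to \infty,
\end{aligned}
\tag{A.3.1}
\]

by Assumption (ii) of Theorem 3.2.1. Also,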
1/ 


'^rii{xi  - xY/{m  - 1) 

,i=l 

= al{m  - 1)-V  [H'^{Dm  ~ dmd^/nr)D] 

= 2al{m  - ly^tr  [(D„  - dmdmlriTf] 


i=l 


= 2al^m  - l)-2  I]  a?  - 2 ^ nl/riT  + ^ n^/4 

_i=l  i=l 

m 

= 2al{m  - 1)“^  ^ n- (1  - ni/rirY 

i=l 

< 2cj^(m  - ^0  as  m oo, 


(A.3.2) 


by  Assumption  (i)  of  Theorem  3.2.1.  This  completes  the  proof  of  Lemma  3.2.1. 
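
The expectation in (A.3.1) can be illustrated by simulation. The sketch below assumes $g_m = n_T - n_T^{-1}\sum n_i^2$ and a sample-size-weighted mean $\bar x$, which is how (A.3.1) is read here; the sample sizes and the values of `mu_x` and `sigma_x` are invented for the example.

```python
import random


def weighted_ss(n, x):
    """sum_i n_i (x_i - xbar)^2, with xbar the n-weighted mean of the x_i."""
    n_T = sum(n)
    xbar = sum(ni * xi for ni, xi in zip(n, x)) / n_T
    return sum(ni * (xi - xbar) ** 2 for ni, xi in zip(n, x))


def g(n):
    """g_m = n_T - (sum_i n_i^2) / n_T (assumed definition)."""
    n_T = sum(n)
    return n_T - sum(ni * ni for ni in n) / n_T


def simulate_mean_ss(n, mu_x, sigma_x, reps=20_000, seed=3):
    """Monte Carlo average of the weighted sum of squares over iid normal x_i."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        x = [rng.gauss(mu_x, sigma_x) for _ in n]
        total += weighted_ss(n, x)
    return total / reps


n = [1 + (i % 5) for i in range(30)]   # hypothetical sample sizes between 1 and 5
sigma_x = 1.5
exact = sigma_x ** 2 * g(n)
approx = simulate_mean_ss(n, mu_x=10.0, sigma_x=sigma_x)
print("sigma_x^2 * g_m:", exact)
print("simulated mean: ", approx)
```

The simulated mean does not depend on `mu_x`, which matches the fact that the weighted sum of squares is location invariant.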




Proof  of  Theorem  3.2.1.  Let  L = (Xi, 
independence  of  (yi, . . . , ym)  with 


, and  X = (xi, . . . , Xm)-  By  the 


E{b,)  = EE 
= EE 
= hiE 
= biE 
= biEE 


|L,  X 


TZMx.-xy  _ 

Ya=\  + biXi){Xj  - X) 
YZi<x,-xY 
YJILi^iXiiXi  - X) 

TZMx,-xy 

YT=i  ^iXi{xi  - x) 


L,  X 


\x 


(A.3.3) 


,EZMx.-xy 

For  fixed  x,  we  introduce  the  orthogonal  transformation  [zi 
C(y^Xi, . . . , ^/n^Xi)^,  where  C is  an  orthogonal  matrix  with  the  hrst  two 
rows  given  by  > ^fn^NnT)  and  , • ■ ■ , ) ’ 

Then  YX^\  niXi[xi  - x)  = Z2  ^ii^i  ~ and  YlT=i  “ Xf  = 

E”=i  ".V?  - = Er.l  Z?  - Z?  = E”.2  Z?.  Hence,  E = 

E” , n.E  - x)2|H2  ^ yz^nYZ^  Z?)‘/2] , where  Z,  ~ N (( £” , ni{x,  - xf]''",  <^S) 
and  is  distributed  independently  of  (Z3, . . . , Z„)  conditional  on  x.  Next  for  ev- 


ery 3 < i < m,  writing  the  row  of  C as  {cn, . . . ,Cim),  one  gets  the  identities 


Er=i  CikV^/V^  = 0 and  Er=i  Cikrik^ixk  - x)/Er=i  M^k  - x) 
equivalently  ^^=1  = 0 and  Er=i  (^ikn]!'^{xk  - x)  = 0.  Together,  they  imply 

E^i  CikTyj'^Xk  = 0 which  is  equivalent  to  E{Zi)  = 0,  3 <i  <rn. 


;=^2V2 


= 0,  or 




Next  applying  part  (ii)  of  Lemma  2.2.1  and  (A. 3. 3), 


1/2 


1/2 


E{bi\x)  = bi  i^rii{xi  - xf  \ M ^ cr^  [(2iL^  + m - 1)  ^|x] 


,i=l 


,i=l 


bi  - xf/a^j  E [{2Km  + m-  1)  ^\x] 

biu~'^  i'^ni{xi- xy/{m- 1)\  E [l  + 2{m  — l)~^Km) 


X 


, i=l 


(A.3.4) 


where  Km\x  ~ Poisson  (|  YllLi  ni{xi  — x)^/cr^)  . By  Lemma  3.2.1,  YlT=i 


xY /{ra  — 1)  ^ ccr^  as  771  — > oo,  and 


E [{m  - l)-'iL„]  = E 


^rii{xi  - xY/{m  - 1) 


_ 1=1 


c o-^(2cr^)  as  m ^ oo; 


(A.3.5) 


1/  [{m  - ly^Km]  = V 
+ E 


'^rii(xi  - xY/{m  - 1) 

,i=i 

m 

(m-  l)“2^ni( 


(2c7^)-^ 


(Xi  - X 


i=\ 


{2a, 


2)-l 

rjJ 


0 + 0 = 0 as  771— >oo. 


(A.3.6) 


Hence,  by  (A.3.5)  and  (A.3.6),  {rn  - 1)"^A"„  ^ c al{2a^)~'^  as  m ^ oo.  Thus, 
(1  + 2Km/{'m  - 1))”^  ^ (1  + cal/a'^)"'^,  and  now  by  the  dominated  convergence 
theorem,  E[l  + 2Kj{m  - 1)]“^  ^ (1  + c al/a^)-^  and  H [1  + 2Kj{m  - 1)]“^  - 
0 as  777  — > oo.  Now  writing  Jm{x)  = A [(1  + 2X^1  {m  — l))“^|x] , E[Jm{x)]  = 

E [(1  + 2Kml{m  - 1))“^]  ^ {l+cal/a'^y^  while  V[Jm{x)]  < H [(1  + 2X^1  {m  - 1)) 
0 as  m — > oo.  Thus,  Jm{x)  ^ (1  + c cj^/(T^)“h  Hence, 

A(6i|x)  ^ bia^cal{l  + cal/a'^)-^ 

^bi{cal)/{cal  + al) 


-n 


(A.3.7) 




Again,  by  Assumption  (i)  of  Theorem  3.2.1, 

\E{bi\x)\  < \bi\a-‘^Y^ni{xi-xf/{m-l) 

m 

< K\bi\a-^'^{xi-x)^/{m-l). 

1 

Since  ~ - 1)  ~ (^lxL-il{rn  - 1)  and  E[xl^_J{m  - l)]^  = 

2(m  - 1)“^  + 1 < 3 for  all  m > 2,  it  follows  that  EiJ)i\x)  is  uniformly  integrable 
in  m >2.  Hence,  E{bi)  = E[E{bi\x)]  bi{c  al)/{c  as  m ^ oo.  This 

completes  the  proof  of  Theorem  3.2.1. 

Proof  of  Theorem  3.2.2.  Let  r,  = rij'^yi,  i = 1,. . . ,m.  Then  E{ri)  = y^(6o  + 
biXi)  = v^[6o  + bix  + 5i(x,  - x)]  and  V{ri)  = cr^  + n^al.  Thus,  writing 
7’  = (ri,...,rn^)  , Qjfj  = [\/^l  (2^1  2/') , . . . , ^)]  > Diag(?'i.i , • . . , ^m)  > 

E{r)  = v^(^o  + bix)um  + g„,  V (r)  = ajlm  + crlE>m,  (A.3.8) 


where  we  recall  that  = {^/n{/ ^/nr,  • • • , y/n^/ y/Er)-  Now,  by  part  (ii)  of 
Theorem  1 of  Searle  (1971,  p55). 


V{MSBy)  = 2{m  - 1)-^ 


tj-  CuDniji^Im 


+ 2nr[(6o  + bix)u^  + — UmU^)Dm{Im 


[{bQ  + hiX)Wm  + q.m\  • 


(A.3.9) 




Noting  that  u^Um  = 1 and  q^Um  = it  follows  that  [(6q  + bix)u^  + q^]{Im  ~ 
UmU^)  = 0^.  Further,  by  idempotency  of  Im  — UmU^, 

bx[{o'gIm  d"  CTu-Dm)(/rn 

= a^tr{Im  ~~  d"  CT^tr{Z)m(-fm  “ d"  2a^(J^tr{lDrn{I m ” 

~ *^e(^  d"  ^u^r[D^  + DmUjji'U^Drn'^m'^rn 
d"  2o'gC7^[fr(-Drri) 

= fjg^(m  - 1)  + critr[Dl^  + {u^DmUmf  - 2u'^Dl^Um] 


/ m 

^ m 

a^(m  - 1)  + 

1 -2j]n^/nr 

+ 2alal(nT-^y  (A-3.10) 

Since $n_i \le K$ for all $1 \le i \le m$, $\sum n_i^2 + (\sum n_i^2/n_T)^2 - 2\sum n_i^3/n_T \le K^2m + K^2 + 2K^2 =
K^2(m + 3)$. Hence, from (A.3.9) and (A.3.10),

$$V(MSB_y) \le 2(m-1)^{-2}[\sigma_e^4(m-1) + K^2(m+3)\sigma_u^4 + 2Km\sigma_e^2\sigma_u^2] = O(m^{-1}) \to 0 \text{ as } m \to \infty.$$


This  completes  the  proof  of  Theorem  3.2.2. 
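
The trace expansion in (A.3.10) can be checked numerically. The sketch below is an illustration only, under the assumption that $D_m = \mathrm{Diag}(n_1, \ldots, n_m)$ and $u_m = (\sqrt{n_1/n_T}, \ldots, \sqrt{n_m/n_T})'$ as in the proof. It compares the trace computed by explicit matrix products with the closed form in the $n_i$. The helper names, the sample sizes and the variance values are hypothetical.

```python
def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def trace_direct(n, sig_e2, sig_u2):
    """tr[{(sig_e2 * I + sig_u2 * D)(I - u u')}^2] by explicit matrix products."""
    m, n_t = len(n), sum(n)
    u = [(n_i / n_t) ** 0.5 for n_i in n]
    # Projection matrix I - u u' (u has unit length).
    proj = [[(i == j) - u[i] * u[j] for j in range(m)] for i in range(m)]
    # Diagonal covariance matrix sig_e2 * I + sig_u2 * D.
    cov = [[(sig_e2 + sig_u2 * n[i]) * (i == j) for j in range(m)] for i in range(m)]
    prod = matmul(cov, proj)
    square = matmul(prod, prod)
    return sum(square[i][i] for i in range(m))


def trace_closed_form(n, sig_e2, sig_u2):
    """Closed form of the same trace in terms of sums of powers of the n_i."""
    m, n_t = len(n), sum(n)
    s2 = sum(n_i ** 2 for n_i in n)
    s3 = sum(n_i ** 3 for n_i in n)
    return (sig_e2 ** 2 * (m - 1)
            + sig_u2 ** 2 * (s2 + (s2 / n_t) ** 2 - 2 * s3 / n_t)
            + 2 * sig_e2 * sig_u2 * (n_t - s2 / n_t))


sizes = [2, 3, 5, 4]  # hypothetical area sample sizes
print(trace_direct(sizes, 1.5, 0.7), trace_closed_form(sizes, 1.5, 0.7))
```

The two printed values should agree up to floating-point rounding. Because every $n_i$ is bounded by $K$, each sum in the closed form grows at most linearly in $m$, which is what yields the $O(m^{-1})$ rate after division by $(m-1)^2$.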




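Proof of Theorem 3.3.1. We rewrite

$$\hat\theta_i^{B} - \hat\theta_i^{EB} = (\hat B_i - B_i)(\bar y_i - \bar y) - B_i(\bar y - b_0 - b_1\mu_x).$$

Hence, by the elementary inequality $(a + b)^2 \le 2(a^2 + b^2)$ and $B_i \le 1$,

$$\begin{aligned}
m^{-1}\sum_{i=1}^m(\hat\theta_i^{B} - \hat\theta_i^{EB})^2
&\le 2m^{-1}\sum_{i=1}^m\big[(\hat B_i - B_i)^2(\bar y_i - \bar y)^2 + B_i^2(\bar y - b_0 - b_1\mu_x)^2\big] \\
&\le 2\Big[m^{-1}\sum_{i=1}^m(\hat B_i - B_i)^2(\bar y_i - \bar y)^2 + (\bar y - b_0 - b_1\mu_x)^2\Big].
\end{aligned}$$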


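But, under Assumption (i) of Theorem 3.2.1,

$$E(\bar y - b_0 - b_1\mu_x)^2 = V(\bar y) = n_T^{-2}\sum_{i=1}^m n_i^2V(\bar y_i) \le [\sigma_e^2 + K(\sigma_u^2 + b_1^2\sigma_x^2)]/n_T \to 0 \text{ as } m \to \infty. \tag{A.3.11}$$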
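Next, writing $M = \sigma_u^2/\sigma_e^2$, $\hat M = \hat\sigma_u^2/\hat\sigma_e^2$, we observe that

$$\begin{aligned}
\max_{1 \le i \le m}|\hat B_i - B_i|
&= \max_{1 \le i \le m}\left|\frac{\hat\sigma_e^2}{\hat\sigma_e^2 + n_i\hat\sigma_u^2} - \frac{\sigma_e^2}{\sigma_e^2 + n_i\sigma_u^2}\right|
= \max_{1 \le i \le m}\left|\frac{1}{1 + n_i\hat M} - \frac{1}{1 + n_iM}\right| \\
&= \max_{1 \le i \le m}\frac{n_i|\hat M - M|}{(1 + n_i\hat M)(1 + n_iM)}.
\end{aligned} \tag{A.3.12}$$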
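
The algebra behind (A.3.12) can be illustrated with exact rational arithmetic. The sketch below is a hedged example, not taken from the dissertation: it evaluates $\max_i |\hat B_i - B_i|$ directly and through the closed form $n_i|\hat M - M|/\{(1 + n_i\hat M)(1 + n_iM)\}$, and also compares the result with the cruder bound $K|\hat M - M|$, which holds under the added assumption that both $\hat M$ and $M$ are nonnegative. The sample sizes and ratio values are hypothetical.

```python
from fractions import Fraction


def shrinkage(n_i, ratio):
    """B_i = 1 / (1 + n_i * M), with M the ratio of variance components."""
    return 1 / (1 + n_i * ratio)


def max_gap(sizes, m_hat, m_true):
    """max_i |B_i(m_hat) - B_i(m_true)| computed directly from the definition."""
    return max(abs(shrinkage(n_i, m_hat) - shrinkage(n_i, m_true)) for n_i in sizes)


def max_gap_closed_form(sizes, m_hat, m_true):
    """max_i n_i |m_hat - m_true| / ((1 + n_i m_hat)(1 + n_i m_true))."""
    return max(n_i * abs(m_hat - m_true) / ((1 + n_i * m_hat) * (1 + n_i * m_true))
               for n_i in sizes)


sizes = [1, 4, 7]                              # hypothetical n_i, bounded by K = 7
m_hat, m_true = Fraction(9, 10), Fraction(1)   # hypothetical estimate and true ratio
gap = max_gap(sizes, m_hat, m_true)
print(gap == max_gap_closed_form(sizes, m_hat, m_true))
print(gap <= max(sizes) * abs(m_hat - m_true))
```

Since both denominators are at least 1 when the ratios are nonnegative, the maximum gap is controlled by $|\hat M - M|$ alone, so consistency of $\hat M$ carries over to the estimated shrinkage factors uniformly in $i$.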




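Note that $\hat M \xrightarrow{P} M$. Also, since $m^{-1}\sum_{i=1}^m n_i(\bar y_i - \bar y)^2 = [(m-1)/m]MSB_y$
and $MSB_y \xrightarrow{P} \sigma_e^2 + c(\sigma_u^2 + b_1^2\sigma_x^2)$, it follows that $m^{-1}\sum_{i=1}^m(\hat B_i - B_i)^2(\bar y_i -
\bar y)^2 \xrightarrow{P} 0$. Moreover, $m^{-1}\sum_{i=1}^m(\hat B_i - B_i)^2(\bar y_i - \bar y)^2 \le m^{-1}\sum_{i=1}^m n_i(\bar y_i - \bar y)^2 =
m^{-1}q_m'(D_m - n_T^{-1}d_md_m')q_m$, where $q_m' = (\bar y_1 - b_0 - b_1\mu_x, \ldots, \bar y_m - b_0 - b_1\mu_x)$,
$D_m = \mathrm{Diag}(n_1, \ldots, n_m)$, $d_m' = (n_1, \ldots, n_m)$. Hence, $V[m^{-1}\sum_{i=1}^m n_i(\bar y_i - \bar y)^2] =
2m^{-2}\mathrm{tr}(I_m - n_T^{-1}d_md_m')^2 = O(m^{-1})$. Thus, $m^{-1}\sum_{i=1}^m(\hat B_i - B_i)^2(\bar y_i - \bar y)^2$ is
uniformly integrable in $m \ge 1$ and $E[m^{-1}\sum_{i=1}^m(\hat B_i - B_i)^2(\bar y_i - \bar y)^2] \to 0$ as $m \to \infty$.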

Hence,  Theorem  3.3.1  is  established. 
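
The quadratic-form representation used in the last step, $\sum_i n_i(\bar y_i - \bar y)^2 = q_m'(D_m - n_T^{-1}d_md_m')q_m$, holds for any centring constant subtracted from the $\bar y_i$, provided $\bar y$ is the $n_i$-weighted mean. The sketch below is a minimal check of that identity under this assumption; the function names and the data are hypothetical and serve only as an illustration.

```python
from fractions import Fraction


def weighted_ss(ybar, n):
    """sum_i n_i (ybar_i - ybar)^2, with ybar the n_i-weighted mean of the ybar_i."""
    n_t = sum(n)
    grand = sum(n_i * y_i for n_i, y_i in zip(n, ybar)) / n_t
    return sum(n_i * (y_i - grand) ** 2 for n_i, y_i in zip(n, ybar))


def quadratic_form(ybar, n, centre):
    """q'(D - d d'/n_T) q with q_i = ybar_i - centre, D = Diag(n), d = n."""
    n_t = sum(n)
    q = [y_i - centre for y_i in ybar]
    q_d_q = sum(n_i * q_i ** 2 for n_i, q_i in zip(n, q))   # q' D q
    d_q = sum(n_i * q_i for n_i, q_i in zip(n, q))          # d' q
    return q_d_q - d_q ** 2 / n_t


n = [2, 3, 5]                                          # hypothetical sample sizes
ybar = [Fraction(1, 2), Fraction(7, 3), Fraction(4)]   # hypothetical area means
for centre in (Fraction(0), Fraction(5, 2), Fraction(-3)):
    print(centre, weighted_ss(ybar, n) == quadratic_form(ybar, n, centre))
```

The identity does not depend on the centring constant, so it applies in particular with the centre $b_0 + b_1\mu_x$ used in the proof, where the centred vector has mean zero.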


REFERENCES 


Battese, G. E., Harter, R. M. and Fuller, W. A. (1988). An error-components
model for prediction of county crop areas using survey and satellite data,
Journal of the American Statistical Association 83: 28-36.

Bolfarine, H. and Zacks, S. (1992). Prediction theory for finite populations,
Springer-Verlag Inc.

Casella, G. (1985). An introduction to empirical Bayes data analysis, The
American Statistician 39: 83-87.

Casella, G. and George, E. I. (1992). Explaining the Gibbs sampler, The American
Statistician 46: 167-174.

Chaudhuri, A. (1994). Small domain statistics: A review, Statistica Neerlandica
48: 215-236.

Cox, D. R. and Snell, E. J. (1968). A general definition of residuals, Journal of the
Royal Statistical Society, Series B, Methodological 30: 248-275.

Cressie, N. (1990). Small-area prediction of undercount using the general linear
model, Statistics Canada Symposium, pp. 93-105.

Datta, G. S. and Ghosh, M. (1991). Bayesian prediction in linear models:
Applications to small area estimation, The Annals of Statistics 19: 1748-1770.

Datta, G. S. and Ghosh, M. (1996). On the invariance of noninformative priors,
The Annals of Statistics 24: 141-159.

Datta, G. S. and Lahiri, P. (2000). A unified measure of uncertainty of estimated
best linear unbiased predictors in small area estimation problems, Statistica
Sinica 10(2): 613-627.

Ericksen, E. P. (1974). A regression method for estimating population changes of
local areas, Journal of the American Statistical Association 69: 867-875.

Ericksen, E. P. and Kadane, J. B. (1987). Sensitivity analysis of local estimates of
undercount in the 1980 U.S. Census, Small Area Statistics: An International
Symposium, pp. 23-45.

Fay, R. E. (1987). Application of multivariate regression to small domain
estimation, Small Area Statistics: An International Symposium, pp. 91-102.




Fay, R. E., III and Herriot, R. A. (1979). Estimates of income for small places:
An application of James-Stein procedures to census data, Journal of the
American Statistical Association 74: 269-277.

Fuller, W. A. (1987). Measurement error models, John Wiley & Sons.

Fuller, W. A. and Harter, R. M. (1987). The multivariate components of variance
model for small area estimation, Small Area Statistics: An International
Symposium, pp. 103-123.

Ghosh, M. and Meeden, G. (1986). Empirical Bayes estimation in finite population
sampling, Journal of the American Statistical Association 81: 1058-1062.

Ghosh, M. and Rao, J. N. K. (1994). Small area estimation: An appraisal (Disc:
p76-93), Statistical Science 9: 55-76.

Ghosh, M., Nangia, N. and Kim, D. H. (1996). Estimation of median income
of four-person families: A Bayesian time series approach, Journal of the
American Statistical Association 91: 1423-1431.

Ghosh, M., Natarajan, K., Stroud, T. W. F. and Carlin, B. P. (1998). Generalized
linear models for small-area estimation, Journal of the American Statistical
Association 93: 273-282.

Godambe, V. P. and Thompson, M. E. (1989). An extension of quasi-likelihood
estimation, Journal of Statistical Planning and Inference 22: 137-152.

Meeden, G. and Ghosh, M. (1997). Bayesian methods in finite population sampling,
Chapman & Hall Ltd.

Morris, C. N. (1982). Natural exponential families with quadratic variance
functions, The Annals of Statistics 10: 65-80.

Morris, C. N. (1983). Natural exponential families with quadratic variance
functions: Statistical theory, The Annals of Statistics 11: 515-529.

Pfeffermann, D. and Burck, L. (1990). Robust small area estimation combining
time series and cross-sectional data, Survey Methodology 16: 217-237.

Prasad, N. G. N. and Rao, J. N. K. (1990). The estimation of the mean squared
error of small-area estimators, Journal of the American Statistical Association
85: 163-171.

Rao, J. (2003). Small Area Estimation, John Wiley & Sons.

Rao, J. N. K. and Yu, M. (1992). Small area estimation by combining time series
and cross-sectional data, ASA Proceedings of the Section on Survey Research
Methods, pp. 1-9.




Robbins, H. (1956). An empirical Bayes approach to statistics, Proceedings of the
Third Berkeley Symposium on Statistics and Probability, pp. 157-164.

Robert, C. P. and Casella, G. (1999). Monte Carlo statistical methods,
Springer-Verlag Inc.

Sarkar, S. and Ghosh, M. (1998). Empirical Bayes estimation of local area
means for NEF-QVF superpopulations, Sankhya, Series B, Indian Journal of
Statistics 60: 464-487.

Schaible, W. L. (1978). Choosing weights for composite estimators for small
area statistics, ASA Proceedings of the Section on Survey Research Methods,
pp. 741-746.

Searle, S. R. (1997). Linear models, John Wiley & Sons.

Small Area Statistics: An International Symposium (1987). John Wiley & Sons.

Small Area Statistics: Contributed Papers (1986). Laboratory for Research in
Statistics and Probability, Carleton University.

Whittemore, A. S. (1989). Errors-in-variables regression using Stein estimates
(Com: 90V44 p263-264), The American Statistician 43: 226-228.


BIOGRAPHICAL  SKETCH 


Karabi Sinha was born in the city of Salvador, Brazil, on the 10th of May,
1976. She graduated with a Bachelor’s degree from Calcutta University in 1998
with First Class Honours in Statistics. She then completed the first year of the
M.Sc. (Statistics) program at Calcutta University before moving to the University of
Florida in the Fall of 1999. She received her master’s degree in Statistics from the
University of Florida in 2001. Upon graduation with her Ph.D., she will join the
University of Illinois at Chicago as an Assistant Professor of Biostatistics in the
School of Public Health.


I certify  that  I have  read  this  study  and  that  in  my  opinion  it  conforms  to 
acceptable  standards  of  scholarly  presentation  and  is  fully  adequate,  in  scope  and 
quality,  as  a dissertation  for  the  degree  of  Doctor  of  Philosophy. 


Malay  Ghosh,  Chair 
Distinguished  Professor  of  Statistics 

I certify  that  I have  read  this  study  and  that  in  my  opinion  it  conforms  to 
acceptable  standards  of  scholarly  presentation  and  is  fully  adequate,  in  scope  and 
quality,  as  a dissertation  for  the  degree  of  Doctor  of  Philosophy. 


Ronald Randles
Professor of Statistics

I certify  that  I have  read  this  study  and  that  in  my  opinion  it  conforms  to 
acceptable  standards  of  scholarly  presentation  and  is  fully  adequate,  in  scope  and 
quality,  as  a dissertation  for  the  degree  of  Doctor  of  Philosophy. 


George  Casella 

Distinguished  Professor  of  Statistics 

I certify  that  I have  read  this  study  and  that  in  my  opinion  it  conforms  to 
acceptable  standards  of  scholarly  presentation  and  is  fully  adequate,  in  scope  and 
quality,  as  a dissertation  for  the  degree  of  Doctor  of  Philosophy. 


Cyndi  Garvan 

Research  Assistant  Professor  of  Statistics 

I certify  that  I have  read  this  study  and  that  in  my  opinion  it  conforms  to 
acceptable  standards  of  scholarly  presentation  and  is  fully  adequate,  in  scope  and 
quality,  as  a dissertation  for  the  degree  of  Doctor  of  Philosophy. 


Bruce  Vogel 

Associate  Professor  of  Health  Services 
Administration 


This  dissertation  was  submitted  to  the  Graduate  Faculty  of  the  Department  of 
Statistics  in  the  College  of  Liberal  Arts  and  Sciences  and  to  the  Graduate  School 
and  was  accepted  as  partial  fulfillment  of  the  requirements  for  the  degree  of  Doctor 
of  Philosophy. 

August 2004

Dean,  Graduate  School 


SOME  CONTRIBUTIONS  TO  SMALL  AREA  ESTIMATION 

Karabi  Sinha 
(352)  271-8294 
Department  of  Statistics 
Chair:  Malay  Ghosh 
Degree:  Doctor  of  Philosophy 
Graduation  Date:  August  2004 

This dissertation reviews existing facets of small area estimation and
contributes to some particular areas.

One  such  instance  is  the  problem  of  estimation  in  the  small  area  setup  where 
the  covariates  are  measured  with  error.  In  other  words,  it  considers  the  role  of 
measurement  error  models  in  small  area  estimation. 

In  the  majority  of  this  dissertation,  we  consider  simultaneous  estimation  of 
finite  population  means  for  several  strata  based  on  two  different  model  structures 
and  assumptions.  In  each  consideration,  a model-based  approach  is  taken,  where 
the  covariates  in  the  super-population  model  are  subject  to  measurement  errors. 

In the first set-up, empirical Bayes (EB) estimators of the strata means are
developed and an asymptotic expression for the Mean Squared Error of the vector of
EB estimators is obtained. In the second set-up, we develop both EB and
hierarchical Bayes (HB) estimators of the strata means. In both cases, findings are
supported by appropriate data analyses and are further validated by simulation studies.

Also, in this dissertation, we consider small domain estimates of health
insurance coverage of minority superpopulations, using a design-assisted
model-based approach. Both the EB and the HB estimators are developed, and
associated measures of precision are also found.


